PROPERTIES  OF  SOME  PRELIMINARY  TEST  ESTIMATORS 
IN  REGRESSION  USING  A  QUADRATIC  LOSS  CRITERION 

M.E.  Bock,  T.A.  Yancey,  G.G.  Judge* 
University  of  Illinois  at  Urbana-Champaign 


This  study  is  concerned  with  deriving  the  properties  of  the 
preliminary  test  estimator  for  the  general  linear  normal  regression 
model  and  determining  the  conditions  necessary  for  the  risk  of  this 
estimator  to  exceed  or  be  less  than  the  conventional  one  under  a 
quadratic  loss  criterion.  A  test  procedure  and  the  problem  of 
choosing  an  optimal  level  of  significance  for  the  test  are  discussed. 


1.  Introduction 

In much of the work concerned with estimating the parameters of behavioral
and technical relations, there is uncertainty as to the appropriate model to
be used. As a consequence, investigators begin with an initial set of
specifications and then modify their models by testing the statistical
significance of some or all of a class of hypotheses. This process makes the
model, and thus the estimation procedure, dependent on the outcome of the
tests of hypotheses and leads to what have been termed in the literature
preliminary test or sequential estimators. Fortunately, this class of
statistical procedures has been studied, starting with Bancroft in the early
1940's, by Mosteller (1948), Kitagawa (1963), Huntsberger (1965), Larson and
Bancroft (1963a, 1963b), and Bancroft (1964), to determine the properties of
the resulting statistics in terms of their means and mean square errors.
Cohen (1965) showed that, under certain assumptions for estimation with
quadratic loss, the preliminary test estimator is inadmissible.
Unfortunately, he did not suggest a superior estimator. Toro-Vizcarrondo
(1968) and Wallace (1971) suggest a practical procedure for determining the
estimator to use, based on a test of compatibility of sample and exact prior
information in a regression model, and in so doing imply a preliminary test
estimator. Ashar (1970) studied the conditional omitted variable (preliminary
test) estimator for the regression model. In an unpublished paper, Sclove et
al. (1970) show that, when certain conditions are fulfilled, the preliminary
test estimator is dominated by the positive part version of the James-Stein
(1961) estimator. Unfortunately, the conclusions flowing from this result are
of limited significance for practitioners since (i) only the orthonormal
regressor case is considered, and extension to the non-orthonormal or general
case is not direct since in reparametrizing the model the measure of goodness
is changed; (ii) the number of regressors must be strictly greater than 2;
(iii) the critical value of the test statistic is constrained to lie within a
range that implies, for the usual sample sizes and numbers of regressors, a
risk function very close to that of the conventional estimator; (iv) the risk
for the positive part estimator is, over the range of critical test values
that are appropriate, approximately equal to that of the preliminary test
estimator; 1/ and (v) the risks of the positive part and preliminary test
estimators are only analyzed for comparable values of the level of the test.
In addition, Strawderman and Cohen (1971, pp. 284-285) have shown, following
the results of Sacks (1963), that the James-Stein (1961) estimator fails to
satisfy the conditions necessary for a generalized Bayes estimator and thus
this estimator is inadmissible. For the same reason, the Stein-James (1966)
positive part estimator is also inadmissible.


*The authors have benefited from papers by, and comments from, T.D.
Wallace, S.L. Sclove and T.A. Bancroft.

1/ See Sclove et al. (1970, p. 9).


In reviewing the literature, it would appear that although many
investigators have not understood the properties of the preliminary test
estimator, or the possible distortion of subsequent inferences from the use
of a preliminary test of significance based on the data of the investigation,
this estimator is widely used in practice. Given this state of affairs, a
study of the properties of the estimator and the characteristics of its risk
function under a squared error loss criterion is of interest and value.
Within this context, the purpose of this paper, which is to a large degree
expository in nature, is to analyze, for the general linear normal regression
model, (i) the properties of the preliminary test estimator implied by a
two-stage testing estimation procedure; (ii) the characteristics of the risk
function for the preliminary test and restricted estimators; (iii) the
conditions under which the risk of the preliminary test estimator is greater
than, less than or equal to that of the conventional and restricted
estimators; (iv) the decision problem of choosing an optimal level of the
test; and (v) the implications of the results for model specification,
conditional mean forecasting and aggregation over micro relations or pooling
data.

The statistical models, estimators and tests are given in Section 2. The
risk function for the preliminary test estimator is derived and compared with
those of other estimators in Sections 3 and 4. The optimal choice of the
level of the test, the sampling properties of the sequential estimator and
the risk for the conditional mean forecasting case are given in Sections 5, 6
and 7. Some theorems and lemmas necessary for the results given in the text
are given in the Appendices.

2.  The  Statistical  Models  and  Estimators 

Assume the linear hypothesis model

(2.1)    $y = X\beta + e$,

where $y$ is a $(T \times 1)$ vector of observations, $X$ is a $(T \times K)$
matrix of non-stochastic variables of rank $K$, $\beta$ is a $(K \times 1)$
vector of unknown parameters and $e$ is a $(T \times 1)$ vector of
unobservable normal random variables with

(2.2)    $E(e) = 0$  and  $E(ee') = \sigma^2 I$,

where $I$ is an identity matrix of order $T$.

Using the sample information, specifications (2.1) and (2.2), and defining
$S = X'X$, the unrestricted least squares estimator is

(2.3)    $b = S^{-1}X'y$,

where $b$ is distributed normally with

(2.4a)   $E(b) = \beta$,

(2.4b)   $E(b-\beta)(b-\beta)' = \sigma^2 S^{-1}$,

and an unbiased estimate of $\sigma^2$ is given by

(2.4c)   $\hat{\sigma}^2 = \dfrac{(y-Xb)'(y-Xb)}{T-K}$.

As is well known for the model (2.1) and (2.2), $b$ is the maximum likelihood
estimator and is unbiased.

In addition to the sample information (2.1), suppose additional information
which consists of $J$ linear restrictions is perceived as

(2.5a)   $R\beta - r = 0$,

where $r$ is a $(J \times 1)$ vector of known elements, $R$ is a
$(J \times K)$ known matrix with rank $J$, and $0$ is a $(J \times 1)$ null
vector. The true relationship among parameters is assumed to be

(2.5b)   $R\beta - r = \delta$,

where $\delta$ is a $(J \times 1)$ vector representing specification errors
in the perceived information, which are zero if that information is correct.

The restricted least squares estimator, which makes use of both the sample
and exact prior information or linear hypotheses, (2.1) and (2.5), is

(2.6)    $\hat{\beta} = b - S^{-1}R'(RS^{-1}R')^{-1}(Rb - r)$,

where $\hat{\beta}$ is normally distributed with mean

(2.7a)   $E(\hat{\beta}) = \beta - S^{-1}R'(RS^{-1}R')^{-1}\delta$,

variance

(2.7b)   $E(\hat{\beta}-E\hat{\beta})(\hat{\beta}-E\hat{\beta})' = \sigma^2[S^{-1} - S^{-1}R'(RS^{-1}R')^{-1}RS^{-1}]$

and mean square error

(2.7c)   $E(\hat{\beta}-\beta)(\hat{\beta}-\beta)' = \sigma^2 S^{-1} - \sigma^2 S^{-1}R'(RS^{-1}R')^{-1}RS^{-1} + S^{-1}R'(RS^{-1}R')^{-1}\delta\delta'(RS^{-1}R')^{-1}RS^{-1}$.

If the restriction hypotheses are correct, $\delta = 0$, the restricted least
squares estimators are unbiased and have smaller variances (mean square
errors) than do the unrestricted least squares estimators. If the prior
restrictions are incorrect, $\delta \neq 0$, use of a quadratic loss function
involves a trade-off between variance and bias and results in the following
risk function for $\hat{\beta}$:

(2.7d)   $E(\hat{\beta}-\beta)'(\hat{\beta}-\beta) = \sigma^2\,\mathrm{tr}\,S^{-1} - \sigma^2\,\mathrm{tr}\,S^{-1}R'(RS^{-1}R')^{-1}RS^{-1} + \mathrm{tr}\,S^{-1}R'(RS^{-1}R')^{-1}\delta\delta'(RS^{-1}R')^{-1}RS^{-1}$.

Using this criterion to appraise performance, the equality restricted
estimator is defined to be better than the unrestricted estimator if (2.7d)
is smaller than the trace of (2.4b).

In order to test the compatibility of the sample information (2.1) and the
linear hypotheses (2.5a), it is conventional to use the test statistic

(2.8)    $u = (Rb-r)'(RS^{-1}R')^{-1}(Rb-r)/J\hat{\sigma}^2$.

If the restrictions (2.5a) are correct, $u$ has a central F distribution with
$J$ and $T-K$ degrees of freedom, and conventional two-stage test procedures
such as those found in Chipman and Rao (1964) and Rao (1945) may be used. If
the linear restrictions are incorrect, $u$ is distributed as a non-central F
distribution with $J$ and $T-K$ degrees of freedom and non-centrality
parameter

(2.9)    $\lambda = \dfrac{\delta'(RS^{-1}R')^{-1}\delta}{2\sigma^2}$.
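The estimators (2.3) and (2.6) and the statistic (2.8) can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes K = 2 regressors and J = 1 restriction so that every inverse is a 2x2 closed form or a scalar, and the function name, data and restriction (slope equal to 2) are hypothetical.

```python
# Illustrative sketch only: K = 2, J = 1, pure standard library.
# The function name, the data and the restriction are assumptions.

def ols_restricted_and_u(X, y, R, r):
    """Return (b, beta_restricted, sigma2_hat, u) for a T x 2 design X.

    R is a length-2 row vector (one restriction R beta = r); r is a scalar.
    """
    T, K, J = len(y), 2, 1
    # S = X'X and X'y
    s11 = sum(row[0] * row[0] for row in X)
    s12 = sum(row[0] * row[1] for row in X)
    s22 = sum(row[1] * row[1] for row in X)
    xy1 = sum(row[0] * yi for row, yi in zip(X, y))
    xy2 = sum(row[1] * yi for row, yi in zip(X, y))
    det = s11 * s22 - s12 * s12
    # S^{-1} by the 2x2 adjugate formula
    i11, i12, i22 = s22 / det, -s12 / det, s11 / det
    # (2.3): b = S^{-1} X'y
    b = [i11 * xy1 + i12 * xy2, i12 * xy1 + i22 * xy2]
    # (2.4c): unbiased estimate of sigma^2
    resid = [yi - (row[0] * b[0] + row[1] * b[1]) for row, yi in zip(X, y)]
    sigma2_hat = sum(e * e for e in resid) / (T - K)
    # S^{-1} R' (a 2-vector) and the scalar R S^{-1} R'
    sinv_rt = [i11 * R[0] + i12 * R[1], i12 * R[0] + i22 * R[1]]
    rsr = R[0] * sinv_rt[0] + R[1] * sinv_rt[1]
    gap = R[0] * b[0] + R[1] * b[1] - r  # Rb - r
    # (2.6): restricted least squares estimator
    beta_restricted = [b[0] - sinv_rt[0] * gap / rsr,
                       b[1] - sinv_rt[1] * gap / rsr]
    # (2.8): test statistic
    u = gap * gap / rsr / (J * sigma2_hat)
    return b, beta_restricted, sigma2_hat, u


X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [1.1, 2.9, 5.2, 6.8, 9.1]
b, beta_r, s2, u = ols_restricted_and_u(X, y, R=[0.0, 1.0], r=2.0)
```

By construction the restricted estimate satisfies the restriction exactly, so here its second element equals 2 up to rounding.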


Wallace (1971) suggests that instead of using the traditional test and
assuming the linear restrictions are correct, we determine values of
$\lambda$ for which the risk of the restricted estimator (2.7d) is less than
that of the unrestricted estimator. Given these critical values of
$\lambda$, a parameter in the distribution of $u$, Wallace tests whether or
not $\lambda$ is small enough to insure that the risk for $\hat{\beta}$ is as
small or smaller than that of $b$. Thus, the hypothesis, $H_0$, that
$\lambda$ is less than or equal to a critical value, is tested against not
$H_0$, by using $u$ and rejecting $H_0$ if
$u \ge F_{(J,\,T-K,\,\lambda_0)} = c$. The value of $c$ is determined, for a
given level of the test, $\alpha$, by

         $\int_c^{\infty} dF_{\lambda_0}(u) = \alpha$,

and $\lambda_0$ is the value of $\lambda$ for which the risk of the
restricted estimator is less than that of the unrestricted estimator. By
accepting $H_0$, we take $\hat{\beta}$ as our estimate of $\beta$, and by
rejecting $H_0$, we use the unrestricted least squares estimate.

In either the conventional or Wallace two-stage testing procedures,
estimation is dependent on a preliminary test of significance, which implies
the use of the preliminary test estimator,

(2.10)   $\hat{\hat{\beta}} = I_{(0,c)}(u)\,\hat{\beta} + I_{[c,\infty)}(u)\,b$,

where $I_{(0,c)}(u)$ and $I_{[c,\infty)}(u)$ are indicator functions which
are one if $u$ falls in the interval subscripted and zero otherwise.

It is useful, for the development of the risk function of
$\hat{\hat{\beta}}$, to write $\hat{\hat{\beta}}$ as

(2.11)   $\hat{\hat{\beta}} = b - I_{(0,c)}(u)\,S^{-1}R'(RS^{-1}R')^{-1}R\,[b - S^{-1}R'(RS^{-1}R')^{-1}r]$.

If, as is the case in much of applied work, we follow the decision rule
suggested by conventional testing procedures or by Wallace, the preliminary
test estimator $\hat{\hat{\beta}}$ results, and it becomes important to know
the sampling properties of this estimator and its performance relative to the
conventional estimator (2.4) and other estimators such as (2.6).
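The decision rule behind (2.10) and (2.11) can be stated in a few lines of code. This is a minimal sketch rather than anything given in the paper: the function name and the numerical inputs are hypothetical, and the critical value c is simply passed in by the caller.

```python
def preliminary_test_estimate(b, beta_restricted, u, c):
    """Apply rule (2.10): restricted estimate if u < c, otherwise b.

    u = 0 has probability zero under the model, so including it in the
    acceptance interval does not change the estimator in practice.
    """
    indicator = 1.0 if 0.0 <= u < c else 0.0  # I_(0,c)(u)
    # Written in the form of (2.11): b - I_(0,c)(u) * (b - beta_restricted)
    return [bi - indicator * (bi - ri) for bi, ri in zip(b, beta_restricted)]


# Hypothetical numbers: a small u keeps the restricted estimate,
# a large u falls back to the unrestricted one.
accepted = preliminary_test_estimate([1.04, 1.99], [1.02, 2.0], u=0.5, c=4.10)
rejected = preliminary_test_estimate([1.04, 1.99], [1.02, 2.0], u=5.0, c=4.10)
```

The estimator is therefore a discontinuous function of the data: it jumps between the two least squares estimates as u crosses c, which is why its risk has to be derived separately.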

3.  The  Risk  Function  of  the  Preliminary  Test  Estimator 

In deriving the properties of the preliminary test estimator,
$\hat{\hat{\beta}}$, use is made of the following quadratic loss function:

(3.1)    $L(\hat{\hat{\beta}}, \beta, \sigma^2) = \|\hat{\hat{\beta}}-\beta\|^2 = (\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta)$,

where the estimator $\hat{\hat{\beta}}$ is defined by (2.10), and its risk is

(3.2a)   $R(\hat{\hat{\beta}}, \beta, \sigma^2) = E[L(\hat{\hat{\beta}}, \beta, \sigma^2)] = E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta)$.

In order to compare the risk functions of different estimators, by using
methods based on the work of Stein (1966) and Sclove et al. (1970), it is
convenient to transform the random variables appearing in (3.2a) and in the
argument of the test statistic (2.8). It is to this sequence of
transformations that we now turn.

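Using $\hat{\hat{\beta}}$ derived in (2.11), the risk function of
$\hat{\hat{\beta}}$ becomes

(3.3)    $\begin{aligned}
E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) ={}& E(b-\beta)'(b-\beta) \\
&- 2E(b-\beta)'\big[I_{(0,c)}(u)\,S^{-1}R'(RS^{-1}R')^{-1}R\,[b - S^{-1}R'(RS^{-1}R')^{-1}r]\big] \\
&+ E\,I_{(0,c)}(u)\,\big[S^{-1}R'(RS^{-1}R')^{-1}R\,[b - S^{-1}R'(RS^{-1}R')^{-1}r]\big]' \\
&\qquad\cdot\big[S^{-1}R'(RS^{-1}R')^{-1}R\,[b - S^{-1}R'(RS^{-1}R')^{-1}r]\big].
\end{aligned}$

The first term on the right side of the equality is
$\sigma^2\,\mathrm{tr}\,S^{-1}$, and $S^{-1}$ may be written using
$P^{-1}(P^{-1})' = S^{-1}$. In addition, an orthogonal transformation $Q$ is
chosen to diagonalize the idempotent matrix
$(P^{-1})'R'(RS^{-1}R')^{-1}RP^{-1}$, which is of rank $J$, giving $J$
characteristic roots of one and $K-J$ zero roots. Consequently, (3.3) may be
written as

(3.4)    $\begin{aligned}
E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) ={}& \sigma^2\,\mathrm{tr}\,S^{-1} \\
&- 2E\,I_{(0,c)}(u)\,(QPb-QP\beta)'\,Q(P^{-1})'P^{-1}Q'
\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}
[QPb - Q(P^{-1})'R'(RS^{-1}R')^{-1}r] \\
&+ E\,I_{(0,c)}(u)\,[QPb - Q(P^{-1})'R'(RS^{-1}R')^{-1}r]'
\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}
Q(P^{-1})'P^{-1}Q'
\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix} \\
&\qquad\cdot[QPb - Q(P^{-1})'R'(RS^{-1}R')^{-1}r],
\end{aligned}$

where $I_J$ is an identity matrix of order $J$ and
$\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}$ is of order $K$. Defining
$w = [QPb - Q(P^{-1})'R'(RS^{-1}R')^{-1}r]$, where $w$ is a normally
distributed random vector with mean
$\eta = QP\beta - Q(P^{-1})'R'(RS^{-1}R')^{-1}r$ and covariance matrix
$\sigma^2 I$, the risk function of $\hat{\hat{\beta}}$ is

(3.5)    $E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) = \sigma^2\,\mathrm{tr}\,S^{-1}
- 2E\,I_{(0,c)}(u)\,(QPb-QP\beta)'\begin{bmatrix} A_1 & 0 \\ A_3 & 0 \end{bmatrix} w
+ E\,I_{(0,c)}(u)\,w'\begin{bmatrix} A_1 & 0 \\ 0 & 0 \end{bmatrix} w$,

where

(3.6)    $Q(P^{-1})'P^{-1}Q' = \begin{bmatrix} A_1 & A_3' \\ A_3 & A_2 \end{bmatrix}$

and $A_1$ and $A_2$ are of order $J$ and $K-J$, respectively.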

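The second term of equation (3.5) may be written as

(3.7)    $-2E\Big[I_{(0,c)}(u)\,(w'-\eta')\begin{bmatrix} A_1 & 0 \\ A_3 & 0 \end{bmatrix} w\Big]$.

The elements of $w$ are independent. Partitioning the $(K \times 1)$ vectors
$w'$ and $\eta'$ as $(w_1'\ w_2')$ and $(\eta_1'\ \eta_2')$, each with $J$
and $K-J$ elements, respectively, the risk function of $\hat{\hat{\beta}}$
becomes

(3.8)    $\begin{aligned}
E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) ={}& \sigma^2\,\mathrm{tr}\,S^{-1}
- E\,I_{(0,c)}(u)\,w_1'A_1w_1 - 2E\,I_{(0,c)}(u)\,w_2'A_3w_1 \\
&+ 2\eta_1'A_1\,E\,I_{(0,c)}(u)\,w_1 + 2\eta_2'A_3\,E\,I_{(0,c)}(u)\,w_1 .
\end{aligned}$

The evaluation of the risk function of $\hat{\hat{\beta}}$ now requires
transforming the test statistic $u$ to a function of $w_1'w_1$.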

a.   A  Reformulation  of  the  Test  Statistic,  u 

Using the operations and notation defined above, the test statistic $u$ may
be written as

(3.9)    $u = \dfrac{[QPb - Q(P^{-1})'R'(RS^{-1}R')^{-1}r]'\,Q(P^{-1})'R'(RS^{-1}R')^{-1}RP^{-1}Q'\,[QPb - Q(P^{-1})'R'(RS^{-1}R')^{-1}r]}{J\hat{\sigma}^2}$

or

(3.10)   $u = \dfrac{w_1'w_1}{J\hat{\sigma}^2}$.

If a $(J \times J)$ orthogonal matrix $C_1$ is chosen so that $C_1A_1C_1'$
becomes a diagonal matrix $D_1$, the test statistic $u$ may be expressed as

(3.11)   $u = \dfrac{w^{*\prime}w^{*}}{J\hat{\sigma}^2}$,

where $w^{*} = C_1w_1$ has mean $\xi^{*} = C_1\eta_1$ and variance
$\sigma^2 I_J$.
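One step that the reformulated risk function relies on can be written out explicitly. The following is a sketch under the model's own assumptions, not a quotation from the paper: $w_2$ is independent of $w_1$ (the elements of $w$ are independent) and of $\hat{\sigma}^2$ (under normality $b$ and the residuals are independent), so $w_2$ is independent of $u$ as expressed in (3.10). The two terms of (3.8) that involve $A_3$ then cancel:

```latex
\begin{aligned}
E\left[I_{(0,c)}(u)\, w_2' A_3 w_1\right]
  &= E[w_2]'\, A_3\, E\left[I_{(0,c)}(u)\, w_1\right]
   = \eta_2' A_3\, E\left[I_{(0,c)}(u)\, w_1\right], \\
-2E\left[I_{(0,c)}(u)\, w_2' A_3 w_1\right]
  + 2\eta_2' A_3\, E\left[I_{(0,c)}(u)\, w_1\right] &= 0 .
\end{aligned}
```

Only the terms in $w_1$ and $A_1$ therefore remain when the indicator is expressed through $w_1'w_1$.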

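b.   A  Reformulation  of  the  Risk  Function  for  $\hat{\hat{\beta}}$

The test statistic presented in (3.10) may now be used as the argument for
the indicator function in (3.8), giving

(3.12)   $E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) = \sigma^2\,\mathrm{tr}\,S^{-1}
- \sigma^2 E\Big[I_{(0,c^*)}\Big(\dfrac{w_1'w_1}{\sigma^2}\Big)\dfrac{w_1'A_1w_1}{\sigma^2}\Big]
+ 2\sigma\,\eta_1'A_1\,E\Big[I_{(0,c^*)}\Big(\dfrac{w_1'w_1}{\sigma^2}\Big)\dfrac{w_1}{\sigma}\Big]$,

where $\dfrac{J\hat{\sigma}^2 c}{\sigma^2} = c^*$. In order to continue the
evaluation of (3.12), the two following theorems, proven in Appendix A, are
required:

Theorem 1: If the $(J \times 1)$ vector $w_1/\sigma$ is distributed
$N(\eta_1/\sigma, I)$, then

         $E\Big[I_{(0,c^*)}\Big(\dfrac{w_1'w_1}{\sigma^2}\Big)\dfrac{w_1}{\sigma}\Big] = E\big[I_{(0,c^*)}(\chi^2_{(\lambda,J+2)})\big]\dfrac{\eta_1}{\sigma}$.

Theorem 2: If the $(J \times 1)$ vector $w_1/\sigma$ is distributed
$N(\eta_1/\sigma, I)$ and $A_1$ is a positive definite symmetric matrix, then

         $E\Big[I_{(0,c^*)}\Big(\dfrac{w_1'w_1}{\sigma^2}\Big)\dfrac{w_1'A_1w_1}{\sigma^2}\Big] = E\big[I_{(0,c^*)}(\chi^2_{(\lambda,J+2)})\big]\,\mathrm{tr}\,A_1 + E\big[I_{(0,c^*)}(\chi^2_{(\lambda,J+4)})\big]\dfrac{\eta_1'A_1\eta_1}{\sigma^2}$.

Utilizing Theorems 1 and 2, (3.12) can be written as

(3.13)   $\begin{aligned}
E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) ={}& \sigma^2\,\mathrm{tr}\,S^{-1}
- \sigma^2 E\big[E[I_{(0,c^*)}(\chi^2_{(\lambda,J+2)}) \mid \hat{\sigma}^2]\big]\,\mathrm{tr}\,A_1 \\
&- \eta_1'A_1\eta_1\,E\big[E[I_{(0,c^*)}(\chi^2_{(\lambda,J+4)}) \mid \hat{\sigma}^2]\big] \\
&+ 2\eta_1'A_1\eta_1\,E\big[E[I_{(0,c^*)}(\chi^2_{(\lambda,J+2)}) \mid \hat{\sigma}^2]\big].
\end{aligned}$

Recognizing that

         $E\big[E[I_{(0,c^*)}(\chi^2_{(\lambda,J+\ell)}) \mid \hat{\sigma}^2]\big] = \Pr\Big[\dfrac{\chi^2_{(\lambda,J+\ell)}}{\chi^2_{(T-K)}} \le \dfrac{cJ}{T-K}\Big]$

and using the orthogonal transformation $C_1$ again,

(3.14)   $\begin{aligned}
E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) ={}& \sigma^2\,\mathrm{tr}\,S^{-1}
- \sigma^2 \Pr\Big[\dfrac{\chi^2_{(\lambda,J+2)}}{\chi^2_{(T-K)}} \le \dfrac{cJ}{T-K}\Big]\sum_{i=1}^{J} d_i \\
&- \Pr\Big[\dfrac{\chi^2_{(\lambda,J+4)}}{\chi^2_{(T-K)}} \le \dfrac{cJ}{T-K}\Big]\sum_{i=1}^{J} d_i\,\xi_i^{*2}
+ 2\Pr\Big[\dfrac{\chi^2_{(\lambda,J+2)}}{\chi^2_{(T-K)}} \le \dfrac{cJ}{T-K}\Big]\sum_{i=1}^{J} d_i\,\xi_i^{*2},
\end{aligned}$

where $d_i$ and $\xi_i^{*}$ are the characteristic roots of $A_1$ and the
elements of $\xi^{*}$, respectively.

Furthermore, the non-centrality parameter for the distribution of $u$, given
in (2.9), can be written as:

(2.9a)   $\lambda = \sum_{i=1}^{J} \dfrac{\xi_i^{*2}}{2\sigma^2}$.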

c.   Characteristics  of  the  Risk  Function 

In line with the specifications in the previous section, define

         $h_\lambda(\ell) = \Pr\Big(\dfrac{\chi^2_{(\lambda,J+\ell)}}{\chi^2_{(T-K)}} \le \dfrac{cJ}{T-K}\Big)$  and  $t_i = d_i \Big/ \sum_{j=1}^{J} d_j$,

which implies $t_i \ge 0$ and $\sum_{i=1}^{J} t_i = 1$. Then (3.14) may be
expressed as

(3.15)   $E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) = \sigma^2\,\mathrm{tr}\,S^{-1} - \sigma^2\Big(\sum_{i=1}^{J} d_i\Big)\Big[h_\lambda(2) + 2\big(h_\lambda(4) - 2h_\lambda(2)\big)\sum_{i=1}^{J} t_i\dfrac{\xi_i^{*2}}{2\sigma^2}\Big]$.

Written in this form, the risk of the preliminary test estimator (3.15) is
seen to be a function of both $\lambda$ and
$\sum_{i=1}^{J} t_i\xi_i^{*2}/2\sigma^2$, where $\lambda$ appears through the
functions $h_\lambda(\ell)$. Thus, a given value of $\lambda$ does not
completely determine the risk function of $\hat{\hat{\beta}}$, and one must
know in addition the values of the $\xi_i^{*}$ which appear in
$\sum_{i=1}^{J} t_i\xi_i^{*2}/2\sigma^2$.

In order to determine the largest and smallest risk values that
$\hat{\hat{\beta}}$ may take for a given
$\lambda = \sum_{i=1}^{J}\xi_i^{*2}/2\sigma^2$, we may choose $t_L$ and $t_S$
as the $t_i$ with the largest and smallest values, respectively, and vary the
$\xi_i^{*2}$'s. The value of the risk function (3.15) is largest, for a given
$\lambda$, when only the $\xi_i^{*2}$ associated with $t_L$ is non-zero,
which means that $\xi_L^{*2} = 2\sigma^2\lambda$. Alternatively, the value of
the risk function (3.15) is smallest when the $\xi_i^{*}$ are varied so that
only the $\xi_i^{*2}$ associated with $t_S$ is non-zero, and hence
$\xi_S^{*2} = 2\sigma^2\lambda$. 2/ Thus, given $\lambda$,

(3.16a)  $E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) \le \sigma^2\,\mathrm{tr}\,S^{-1} - \sum_{i=1}^{J} d_i\,\sigma^2\big[h_\lambda(2) + 2\big(h_\lambda(4) - 2h_\lambda(2)\big)\lambda t_L\big]$,

and

(3.16b)  $E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) \ge \sigma^2\,\mathrm{tr}\,S^{-1} - \sum_{i=1}^{J} d_i\,\sigma^2\big[h_\lambda(2) + 2\big(h_\lambda(4) - 2h_\lambda(2)\big)\lambda t_S\big]$.

2/ Note that
$1 \ge h_\lambda(2) = \Pr\Big[\dfrac{\chi^2_{(\lambda,J+2)}}{\chi^2_{(T-K)}} \le \dfrac{cJ}{T-K}\Big] \ge \Pr\Big[\dfrac{\chi^2_{(\lambda,J+4)}}{\chi^2_{(T-K)}} \le \dfrac{cJ}{T-K}\Big] = h_\lambda(4) \ge 0$,
since $\chi^2_{(\lambda,J+4)}$ is stochastically larger than
$\chi^2_{(\lambda,J+2)}$.

Furthermore, by varying the $\xi_i^{*2}$ under the restriction
$\lambda = \sum_{i=1}^{J}\xi_i^{*2}/2\sigma^2$, the risk of the preliminary
test estimator, $E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta)$, can
assume any value from (3.16b) to (3.16a). There is only one point for each
equation for which the value of the right side is
$\sigma^2\,\mathrm{tr}\,S^{-1}$, the risk of the conventional estimator.
(3.16a) and (3.16b) are equal when $\lambda = 0$.

The characteristics of the risk functions (3.16a) and (3.16b) are reflected
in Figure 1, for the situation where $\alpha = .05$, $\sigma^2 = 1$,
$J = 2$, $T-K = 10$, $\sum_{i=1}^{J} d_i = 1$, $t_L = .9$, $t_S = .1$ and
$E(b-\beta)'(b-\beta) = 2$.
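The bounds (3.16a) and (3.16b) can be explored numerically. The sketch below is an illustration, not the authors' computation: it estimates $h_\lambda(\ell)$ by simulation using only the Python standard library, takes c = 4.10 as an approximate 5% table value for F(2, 10), and picks $\lambda = 1$ arbitrarily. The function names are hypothetical. Because the paper's $\lambda$ carries a factor of one half, the simulated normal variate is shifted by $\sqrt{2\lambda}$.

```python
import random


def h_monte_carlo(lam, ell, J, df, c, draws=20000, seed=0):
    """Estimate h_lambda(ell) = Pr[chi2(lam, J+ell) / chi2(df) <= c*J/df]."""
    rng = random.Random(seed)
    shift = (2.0 * lam) ** 0.5  # the paper's lambda is half the usual noncentrality
    n = J + ell
    hits = 0
    for _ in range(draws):
        # Noncentral chi-square: one shifted squared normal plus n-1 central ones.
        num = (rng.gauss(0.0, 1.0) + shift) ** 2
        num += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n - 1))
        # Central chi-square with df degrees of freedom, drawn as Gamma(df/2, scale 2).
        den = rng.gammavariate(df / 2.0, 2.0)
        if num / den <= c * J / df:
            hits += 1
    return hits / draws


def pte_risk_bounds(lam, h2, h4, t_L, t_S, sum_d, sigma2, ols_risk):
    """Return (lower, upper) risk of the preliminary test estimator: (3.16b), (3.16a)."""
    def risk(t):
        return ols_risk - sum_d * sigma2 * (h2 + 2.0 * (h4 - 2.0 * h2) * lam * t)
    return risk(t_S), risk(t_L)


# Setting used for Figure 1; c = 4.10 is an assumed approximate table value.
J, df, c = 2, 10, 4.10
lam = 1.0
h2 = h_monte_carlo(lam, 2, J, df, c)
h4 = h_monte_carlo(lam, 4, J, df, c)
low, high = pte_risk_bounds(lam, h2, h4, t_L=0.9, t_S=0.1,
                            sum_d=1.0, sigma2=1.0, ols_risk=2.0)
```

Repeating the last four lines over a grid of $\lambda$ values would trace out the two bounding curves; the simulation error shrinks as `draws` grows.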


4.  Comparison  of  the  Risk  Functions 

a.   Conventional  and  Preliminary  Test  Estimators 

We now wish to determine conditions under which the risk function of the
preliminary test estimator, $\hat{\hat{\beta}}$, is less than or greater than
that of the unrestricted least squares estimator, $b$, in terms of
$\lambda$. Subtracting (3.15) from the risk function for $b$, (2.4), we have

(4.1)    $E(b-\beta)'(b-\beta) - E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) = \Big(\sum_{i=1}^{J} d_i\Big)\sigma^2\Big[h_\lambda(2) + 2\big(h_\lambda(4) - 2h_\lambda(2)\big)\sum_{i=1}^{J} t_i\dfrac{\xi_i^{*2}}{2\sigma^2}\Big]$.

For a given value of $\lambda$, the risk function least favorable to the
preliminary test estimator is equation (3.16a), where
$\xi_L^{*2} = 2\sigma^2\lambda$. Therefore, for a fixed $\lambda$, the
smallest possible value of (4.1) is

(4.2a)   $\Big(\sum_{i=1}^{J} d_i\Big)\sigma^2\big[h_\lambda(2) + 2\big(h_\lambda(4) - 2h_\lambda(2)\big)t_L\lambda\big]$,

and the risk of the conventional estimator $b$ is at least as large as that
for the preliminary test estimator $\hat{\hat{\beta}}$ if

(4.3a)   $\lambda \le \dfrac{1}{2t_L\Big[2 - \dfrac{h_\lambda(4)}{h_\lambda(2)}\Big]}$.

Figure 1. Risk Functions for the Conventional, Restricted and Preliminary
Test Estimators.

Alternatively, the risk function most favorable to the preliminary test
estimator is equation (3.16b), where $\xi_S^{*2} = 2\sigma^2\lambda$.

For a given value of $\lambda$, the largest possible value of (4.1) is

(4.2b)   $\Big(\sum_{i=1}^{J} d_i\Big)\sigma^2\big[h_\lambda(2) + 2\big(h_\lambda(4) - 2h_\lambda(2)\big)t_S\lambda\big]$,

which lies above (4.2a) for every $\lambda$. Making use of (4.2b), the
condition for the risk of $b$ to be less than or equal to that of
$\hat{\hat{\beta}}$ is

(4.3b)   $\lambda \ge \dfrac{1}{2t_S\Big[2 - \dfrac{h_\lambda(4)}{h_\lambda(2)}\Big]}$.

Since $h_\lambda(4)$ and $h_\lambda(2)$ are complicated functions of
$\lambda$, it is difficult, in general, to solve for the equality of the risk
functions involving $t_L$, i.e., for $\lambda_0$ such that
$f(\lambda_0) = 0$, where
$f(\lambda) = \lambda - \dfrac{1}{2t_L[2 - h_\lambda(4)/h_\lambda(2)]}$. We
do, however, know that $\lambda_0 \ge 1/4t_L$, since
$h_\lambda(4)/h_\lambda(2) \ge 0$, and
$\lambda_0 \ge \dfrac{1}{2t_L(2-\omega_0)}$ if $T-K \ge 2$, since
$h_\lambda(4)/h_\lambda(2) \ge \omega_0$, where $\omega_0$ is a lower bound
depending on $T-K$ and $c$ [defining expression illegible in the source]. 3/

Correspondingly, the same type of reasoning applies to finding the equality
of the risk functions involving $t_S$, i.e., for $\lambda_1$ such that
$g(\lambda_1) = 0$, where
$g(\lambda) = \lambda - \dfrac{1}{2t_S[2 - h_\lambda(4)/h_\lambda(2)]}$, with
$g(\lambda) < 0$ if $\lambda < \lambda_1$. Using $t_S$, since
$h_\lambda(4)/h_\lambda(2) \le 1$, the value $\lambda_1$ below which the risk
for the preliminary test estimator can be less than that of the conventional
estimator satisfies $\lambda_1 \le \dfrac{1}{2t_S}$.

3/ See Theorem 1 in Appendix B.

Therefore, the values of $\lambda$ at which the risk functions of the
preliminary test and unrestricted least squares estimators are equal have the
following bounds:

(4.3c)   $\dfrac{1}{4t_L} \le \lambda_0 \le \lambda_1 \le \dfrac{1}{2t_S}$,

with the lower bound replaced by $\dfrac{1}{2t_L(2-\omega_0)}$ if
$T-K \ge 2$.

In order to depict this situation graphically, the case that formed the basis
for Figure 1 is used and the neighborhood of the origin is enlarged in
Figure 2, in which the actual values of $\lambda_0$ and $\lambda_1$ are
identified.
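The outer bounds on $\lambda_0$ and $\lambda_1$ give a rule that needs no evaluation of $h_\lambda(\ell)$. The sketch below only encodes those two outer bounds, $1/4t_L$ and $1/2t_S$; the function name and return labels are hypothetical, and values of $\lambda$ between the bounds are reported as undetermined rather than resolved.

```python
def compare_pte_with_ols(lam, t_L, t_S):
    """Classify lambda using only the outer bounds 1/(4 t_L) and 1/(2 t_S)."""
    if lam <= 1.0 / (4.0 * t_L):
        # Below the lower bound on lambda_0: the least favorable case still favors the PTE.
        return "pte risk <= ols risk"
    if lam >= 1.0 / (2.0 * t_S):
        # Above the upper bound on lambda_1: the most favorable case still favors b.
        return "ols risk <= pte risk"
    return "undetermined by the bounds"


# With t_L = .9 and t_S = .1 the bounds are about 0.278 and 5.
small = compare_pte_with_ols(0.2, 0.9, 0.1)
middle = compare_pte_with_ols(1.0, 0.9, 0.1)
large = compare_pte_with_ols(6.0, 0.9, 0.1)
```

The wide undetermined band in this example reflects the spread between $t_L$ and $t_S$; with equal characteristic roots the band narrows to $(J/4, J/2)$.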

As a special case, if the characteristic roots, $d_i$, are the same (for
example, $X'X$ is a scalar matrix) and thus $t_1 = t_2 = \cdots = t_J$, then
$t_L$ and $t_S$ equal $1/J$, and (3.16a) is equal to (3.16b). Under this
situation, the conditions for the risk of the preliminary test estimator to
be less than or equal to that of the conventional estimator are

(4.3d)   $\dfrac{J}{4} \le \lambda_0 = \lambda_1 \le \dfrac{J}{2}$,

where the lower bound is replaced by $\dfrac{J}{2(2-\omega_0)}$ if
$T-K \ge 2$. This result is consistent with that derived by Sclove et al.
(1970) for the orthonormal regressor case.

We may summarize the conclusions to this point as follows:

If $\lambda \notin (\lambda_0, \lambda_1)$ or
$\lambda \notin \big(\tfrac{1}{4t_L}, \tfrac{1}{2t_S}\big)$ and $\lambda$ is
known, we can decide if (4.1) is positive or negative. Furthermore, even if
$\lambda$ is known and $\lambda \in (\lambda_0, \lambda_1)$ with
$\lambda_0 \neq \lambda_1$, one cannot determine whether or not the risk
function of $b$ exceeds that of $\hat{\hat{\beta}}$ without knowing the value
of $\sum_{i=1}^{J} t_i\xi_i^{*2}/2\sigma^2$.

Figure 2. Risk Functions for the Conventional, Restricted and Preliminary
Test Estimators in the Neighborhood of $\lambda = 0$.

The risk of $\hat{\hat{\beta}}$ at the origin where $\lambda = 0$, a
consequence of $\delta = 0$, is

         $\sigma^2\,\mathrm{tr}\,S^{-1} - \sigma^2\Pr\Big[\dfrac{\chi^2_{(J+2)}}{\chi^2_{(T-K)}} \le \dfrac{cJ}{T-K}\Big]\sum_{i=1}^{J} d_i$,

which is smaller than $\sigma^2\,\mathrm{tr}\,S^{-1}$, the risk of $b$. This
can be seen from (3.15) since $\lambda = 0$ implies
$\xi_i^{*2}/2\sigma^2 = 0$, $i = 1, \ldots, J$. At $\lambda = 0$, the risk
function (3.16a) is equal to (3.16b).

Alternatively, since $\lambda h_\lambda(\ell)$ and $h_\lambda(\ell)$ approach
0 as $\lambda$ approaches $\infty$, (3.15), (3.16a) and (3.16b) approach
$\sigma^2\,\mathrm{tr}\,S^{-1}$ as $\lambda \to \infty$. This implies that
the risk of $\hat{\hat{\beta}}$ approaches that of $b$ from above as
$\lambda \to \infty$ (see Figure 1).

Finally, it should be noted that the terms $h_\lambda(2)$ and $h_\lambda(4)$
are the probabilities of ratios of random variables being less than a
constant, and they depend on the critical value $c$ or the level of the test
$\alpha$. Therefore, as $\alpha \to 0$, $h_\lambda(\ell) \to 1$ and the risk
of the preliminary test estimator, $\hat{\hat{\beta}}$, approaches that of
the restricted least squares estimator, $\hat{\beta}$, since in a repeated
sampling context $\hat{\beta}$ is used more frequently as an estimator of
$\beta$ for all $\lambda$. Alternatively, as $\alpha \to 1$ and
$h_\lambda(\ell) \to 0$, the risk function for $\hat{\hat{\beta}}$ approaches
that for the conventional estimator, $b$.

b.   Conventional  and  Restricted  Estimators 

To facilitate a comparison of the risk of the equality restricted least
squares estimator, $\hat{\beta}$, with the conventional estimator, $b$, we
note from the derivation given in Appendix C that

(4.4)    $\begin{aligned}
E(\hat{\beta}-\beta)'(\hat{\beta}-\beta)
&= \sigma^2\,\mathrm{tr}\,S^{-1} - \sigma^2\sum_{i=1}^{J} d_i + \sum_{i=1}^{J} d_i\,\xi_i^{*2} \\
&= \sigma^2\,\mathrm{tr}\,S^{-1} - \sigma^2\sum_{i=1}^{J} d_i + \sigma^2\Big(\sum_{i=1}^{J} d_i\Big)\Big[2\sum_{i=1}^{J} t_i\dfrac{\xi_i^{*2}}{2\sigma^2}\Big].
\end{aligned}$

Written in this form, it is clear that, in general, (4.4) is not just a
function of $\lambda$, but one must know the $\xi_i^{*}$ which appear in
$\sum_{i=1}^{J} t_i\xi_i^{*2}/2\sigma^2$. As a counterpart for (3.16a) and
(3.16b), for a fixed value of $\lambda$,

(4.5a)   $E(\hat{\beta}-\beta)'(\hat{\beta}-\beta) \le \sigma^2\,\mathrm{tr}\,S^{-1} - \sigma^2\sum_{i=1}^{J} d_i + \sigma^2\Big(\sum_{j=1}^{J} d_j\Big)2t_L\lambda$,

and

(4.5b)   $E(\hat{\beta}-\beta)'(\hat{\beta}-\beta) \ge \sigma^2\,\mathrm{tr}\,S^{-1} - \sigma^2\sum_{i=1}^{J} d_i + \sigma^2\Big(\sum_{j=1}^{J} d_j\Big)2t_S\lambda$.

Thus, the risk function (4.4) can assume any value from (4.5b) to (4.5a), by
varying the $\xi_i^{*2}$'s under the restriction that
$\sum_{i=1}^{J}\xi_i^{*2}/2\sigma^2 = \lambda$. Making use of (4.4) or (4.5a)
and (4.5b), when $\lambda = 0$, the risk of $\hat{\beta}$ is
$\sigma^2\,\mathrm{tr}\,S^{-1} - \sigma^2\sum_{i=1}^{J} d_i \ge 0$. Also, as
$\lambda$ goes to infinity, the expressions on the right side of (4.5a) and
(4.5b), and thus (4.4), go to infinity. The characteristics of the risk
function for $t_L$ and $t_S$ in (4.5a) and (4.5b), respectively, for the
example given in Section 3c, are given in Figures 1 and 2.

Making use of (3.2b), and comparing the risk of $\hat{\beta}$ and $b$, we
have

(4.6)    $E(b-\beta)'(b-\beta) - E(\hat{\beta}-\beta)'(\hat{\beta}-\beta) = \sigma^2\sum_{i=1}^{J} d_i\Big[1 - 2\sum_{i=1}^{J} t_i\dfrac{\xi_i^{*2}}{2\sigma^2}\Big]$.

By varying $\xi_i^{*2}$ where $\lambda = \sum_{i=1}^{J}\xi_i^{*2}/2\sigma^2$,
(4.6) may assume any value from

(4.7a)   $\sigma^2\sum_{i=1}^{J} d_i\,[1 - 2t_L\lambda]$

to

(4.7b)   $\sigma^2\sum_{i=1}^{J} d_i\,[1 - 2t_S\lambda]$,  for fixed $\lambda$.

Therefore, (4.6) will be non-negative if

(4.8a)   $\lambda \le \dfrac{1}{2t_L}$

and (4.6) will be non-positive if

(4.8b)   $\lambda \ge \dfrac{1}{2t_S}$.

If $t_S \neq t_L$ and
$\lambda \in \big(\tfrac{1}{2t_L}, \tfrac{1}{2t_S}\big)$, one cannot
determine the sign of (4.6) even if $\lambda$ is known precisely. Thus, as in
the case of the preliminary test estimator, it is necessary to know the value
of $\sum_{i=1}^{J} t_i\big(\xi_i^{*2}/2\sigma^2\big)$.
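The range spanned by (4.7a) and (4.7b) is simple enough to compute directly. The following sketch uses a hypothetical function name and the illustrative values $\sigma^2 = 1$, $\sum d_i = 1$, $t_L = .9$ and $t_S = .1$; it shows a value of $\lambda$ at which the sign of (4.6) cannot be settled from $\lambda$ alone.

```python
def restricted_vs_ols_range(lam, t_L, t_S, sum_d, sigma2):
    """Smallest and largest values of (4.6), i.e. (4.7a) and (4.7b)."""
    smallest = sigma2 * sum_d * (1.0 - 2.0 * t_L * lam)  # (4.7a)
    largest = sigma2 * sum_d * (1.0 - 2.0 * t_S * lam)   # (4.7b)
    return smallest, largest


# lambda = 1 lies between 1/(2 t_L) = 0.556 and 1/(2 t_S) = 5,
# so the two ends of the range have opposite signs.
smallest, largest = restricted_vs_ols_range(1.0, t_L=0.9, t_S=0.1,
                                            sum_d=1.0, sigma2=1.0)
sign_is_ambiguous = smallest < 0.0 < largest
```

When both ends share a sign, which happens outside the interval $(1/2t_L, 1/2t_S)$, the comparison of $\hat{\beta}$ with $b$ is settled by $\lambda$ alone.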


c.   Preliminary  Test  and  Restricted  Estimators 

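By making use of (3.15) and (4.4), the difference between the risk functions
for $\hat{\hat{\beta}}$ and $\hat{\beta}$ may be expressed as

(4.9)    $E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta) - E(\hat{\beta}-\beta)'(\hat{\beta}-\beta) = \sigma^2\Big(\sum_{i=1}^{J} d_i\Big)\Big[1 - h_\lambda(2) - 2\big(1 + h_\lambda(4) - 2h_\lambda(2)\big)\Big(\sum_{i=1}^{J} t_i\dfrac{\xi_i^{*2}}{2\sigma^2}\Big)\Big]$.

Proceeding as before, for a given value of $\lambda$, (4.9) may assume any
value between

(4.10a)  $\sigma^2\Big(\sum_{i=1}^{J} d_i\Big)\big[1 - h_\lambda(2) - 2\big(1 + h_\lambda(4) - 2h_\lambda(2)\big)(\lambda t_L)\big]$

and

(4.10b)  $\sigma^2\Big(\sum_{i=1}^{J} d_i\Big)\big[1 - h_\lambda(2) - 2\big(1 + h_\lambda(4) - 2h_\lambda(2)\big)(\lambda t_S)\big]$,

if we let the values of the $\xi_i^{*}$ vary under the restriction that
$\sum_{i=1}^{J}\xi_i^{*2}/2\sigma^2 = \lambda$. Now
$1 - h_\lambda(4) \ge 1 - h_\lambda(2)$ implies
$1 + h_\lambda(4) - 2h_\lambda(2) \ge 0$, and the difference in the risk
functions given by (4.10a) (and thus (4.9)) will be non-negative if

(4.11)   $\lambda \le \dfrac{1}{2t_L\Big[2 - \dfrac{1 - h_\lambda(4)}{1 - h_\lambda(2)}\Big]}$.

Since $\dfrac{1 - h_\lambda(4)}{1 - h_\lambda(2)} \ge 1$, this means that
(4.11) is satisfied if

(4.12)   $\lambda \le \dfrac{1}{2t_L}$.

Also, (4.10b) (and thus (4.9)) will be non-positive if

(4.13)   $\lambda \ge \dfrac{1}{2t_S\Big[2 - \dfrac{1 - h_\lambda(4)}{1 - h_\lambda(2)}\Big]}$.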

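If [condition illegible in the source], then
$\Big[2 - \dfrac{1 - h_\lambda(4)}{1 - h_\lambda(2)}\Big]^{-1} \le \Pr\Big[\dfrac{\chi^2_{(\cdot)}}{\chi^2_{(T-K)}} \ge \cdot\,\Big]^{-1}$,
where the numerator degrees of freedom and the bound inside the probability
are illegible in the source, and thus (4.13) is satisfied if

(4.14)   $\lambda \ge \dfrac{1}{2t_S\Pr\Big[\dfrac{\chi^2_{(\cdot)}}{\chi^2_{(T-K)}} \ge \cdot\,\Big]}$.

The inequalities (4.12) and (4.14) are helpful because it is difficult to
solve for $\lambda_2$ such that (4.11) holds for $\lambda \le \lambda_2$ and
$\lambda_3$ such that (4.13) holds for $\lambda \ge \lambda_3$. Therefore,
the values of $\lambda$ at which the risk functions of the preliminary test
and restricted least squares estimators are equal have the following bounds:

(4.15)   $\dfrac{1}{2t_L} \le \lambda_2 \le \lambda_3 \le \dfrac{1}{2t_S\Pr\Big[\dfrac{\chi^2_{(\cdot)}}{\chi^2_{(T-K)}} \ge \cdot\,\Big]}$.

As before, if $t_1 = t_2 = \cdots = t_J$, then (4.9) = (4.10a) = (4.10b) and
$\lambda_2 = \lambda_3$. This implies $t_i = \dfrac{1}{J}$, for
$i = 1, \ldots, J$, and thus

(4.16)   $\dfrac{J}{2} \le \lambda_2 = \lambda_3 \le \dfrac{J}{2\Pr\Big[\dfrac{\chi^2_{(\cdot)}}{\chi^2_{(T-K)}} \ge \cdot\,\Big]}$.

If $\lambda \notin (\lambda_2, \lambda_3)$ or $\lambda$ lies outside the
interval between the outer bounds in (4.15), and $\lambda$ is known, we can
decide if (4.9) is positive or negative. However, even if $\lambda$ is known
and $\lambda \in (\lambda_2, \lambda_3)$ with $\lambda_2 \neq \lambda_3$
(which occurs if $t_S \neq t_L$), one cannot determine whether or not the
risk function of $\hat{\hat{\beta}}$ exceeds that of $\hat{\beta}$ without
knowing the value of $\sum_{i=1}^{J} t_i\xi_i^{*2}/2\sigma^2$.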


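For $\lambda = 0$, the difference in the risk functions of
$\hat{\hat{\beta}}$ and $\hat{\beta}$ (4.9) is
$\sigma^2\Big(\sum_{i=1}^{J} d_i\Big)\Pr\Big[\dfrac{\chi^2_{(J+2)}}{\chi^2_{(T-K)}} \ge \dfrac{cJ}{T-K}\Big] \ge 0$.
Furthermore, as $\lambda$ goes to infinity, the right side of (4.9) goes to
minus infinity because $\lambda h_\lambda(\ell)$ and $h_\lambda(\ell)$ go to
zero, so (4.10a) and (4.10b) go to minus infinity (see Figure 1).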

5.  Optimal  Choice  of  a 

For $\lambda$ such that $0 \le \lambda \le 1/(4t_L)$, the risk of the preliminary test
estimator is smaller than that of the conventional estimator regardless of the choice of
the level of statistical significance, $\alpha$, or the critical value, $c$. However,
the choice of $\alpha$ or $c$ does affect the magnitude of the difference between the
risk functions for each $\lambda$.

When $\alpha$ approaches zero, the critical value $c$ approaches infinity, and the
risk function of the preliminary test estimator approaches that of the
restricted least squares estimator. Alternatively, when $\alpha$ approaches one, the
risk function of the preliminary test estimator approaches that of the
conventional estimator, and the difference between these two risk functions tends to zero.
A graph of the risk functions for two levels of $\alpha$ is given in Figure 3.
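
The link between $\alpha$, $c$ and the quantity $w_0 = cJ/(T-K+cJ)$ of Appendix B can be illustrated numerically. The sketch below is not part of the original derivation: it assumes the special case $J = 2$, for which the central F survival function has the closed form $\Pr[F_{2,n} > c] = (1 + 2c/n)^{-n/2}$, and it uses a hypothetical value $n = T-K = 20$. The function names are illustrative only.

```python
def critical_value_j2(alpha, n):
    """Critical value c with Pr[F(2, n) > c] = alpha.

    Relies on the closed form Pr[F(2, n) > c] = (1 + 2c/n)^(-n/2), which is
    valid only for two numerator degrees of freedom (assumption of this sketch).
    """
    return (n / 2.0) * (alpha ** (-2.0 / n) - 1.0)


def w0(c, J, n):
    """w_0 = cJ / (T - K + cJ), where n stands for T - K."""
    return c * J / (n + c * J)


n = 20  # hypothetical residual degrees of freedom, T - K
for alpha in (0.999, 0.5, 0.05, 1e-6):
    c = critical_value_j2(alpha, n)
    print(f"alpha={alpha:<8g} c={c:10.4f} w0={w0(c, 2, n):.4f}")
```

In this special case $w_0 = 1 - \alpha^{2/n}$, so $w_0$ moves from 0 toward 1 as $\alpha$ moves from one toward zero, which matches the two limiting cases described above.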

Since we have expressed the conditions for the risk of the conventional
estimator to exceed that of the preliminary test estimator in terms of the
non-centrality parameter, $\lambda$, of the non-central F distribution, we could
follow Toro-Vizcarrondo and Wallace (1968) or Wallace (1970) and propose
a test, for example, for the orthonormal regressor case for the hypothesis
$\lambda \le J/4$ against the alternative $\lambda > J/4$. For this test, the investigator
computes the value of the test statistic and rejects the hypothesis if this
value exceeds the critical value based on an $\alpha$ and the non-central F with
$J$ and $T-K$ degrees of freedom and a non-centrality parameter of $J/4$. However,
it does not matter whether one uses the test statistic with the central F
($F_{J,T-K,\lambda=0}$), the one originally proposed by Toro-Vizcarrondo and Wallace
($F_{J,T-K,\lambda=1/2}$), or the one suggested above, since the critical points for
these tests can always be matched up by varying $\alpha$ for one test versus another.

Since in reality $\lambda$ is unknown and the gain or loss for the preliminary
test estimator varies with the choice of $c$, we are faced with a decision
problem, where the optimum critical value, $c$, or the level of statistical
significance, $\alpha$, depends on the optimality criterion used. If we are interested in
a choice of $c$ or $\alpha$ which would minimize the maximum risk, a minimax solution
is $c = 0$. Given this trivial result, one alternative is to use the minimax
regret in the class of preliminary test estimators. Some results have been
obtained in this area for the risk function criterion by Sawa and Hiromatsu
(1970) for the special case when $J = 1$.

Alternatively, if one could specify a tractable prior probability density
function for $\lambda$, for example a uniform or a chi-square distribution, then
a Bayesian extension of these results is possible. Relative to (4.1)', for a
uniform density for $\lambda$ on $[0,\infty)$, the integral exists and is finite. When $\alpha = 1$,
i.e., $w_0 = 0$, the integral is 0. Of course, when $\alpha = 0$, i.e., $w_0 = 1$, then
the integral is infinite because the restricted estimator is always chosen.

6. The Bias and Covariance Matrix of the Preliminary Test Estimator $\hat{\hat{\beta}}$

It is well known that the unrestricted estimator $b$ is unbiased with a
covariance matrix $\sigma^2 S^{-1}$. The restricted estimator $\hat{\beta}$ has mean

(6.1)  $E\hat{\beta} = \beta - S^{-1}R'(RS^{-1}R')^{-1}\delta,$

and if $\delta \ne 0$, $\hat{\beta}$ is biased with bias $-S^{-1}R'(RS^{-1}R')^{-1}\delta$. Furthermore, $\hat{\beta}$ has a
covariance matrix,

(6.2)  $E(\hat{\beta}-E\hat{\beta})(\hat{\beta}-E\hat{\beta})' = \sigma^2[S^{-1} - S^{-1}R'(RS^{-1}R')^{-1}RS^{-1}].$

The mean of the preliminary test estimator,

(6.3)  $E\hat{\hat{\beta}} = E[I_{(0,c)}(u)\hat{\beta}] + E[I_{[c,\infty)}(u)b],$

is evaluated in Appendix D and gives, from (D.4),

(6.4)  $E\hat{\hat{\beta}} = \beta - h_\lambda(2)S^{-1}R'(RS^{-1}R')^{-1}\delta.$

If $\delta = 0$, the preliminary test estimator is unbiased. Otherwise, its bias
depends on (i) the probability $h_\lambda(2)$ of a random variable with a non-central F
distribution being smaller than a constant determined by the level of the test
and the number of restrictions, $J$, as well as the incorrectness of the
restriction through the non-centrality parameter, $\lambda$, (ii) the incorrectness of the
prior information through $\delta$, and (iii) the matrix $S^{-1}R'(RS^{-1}R')^{-1}$. Since
$h_\lambda(2) \le 1$, the bias is never larger in magnitude than that of $\hat{\beta}$ given in (6.1).
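
The shrinkage of the mean by the factor $h_\lambda(2)$ can be checked by simulation. The following is a minimal sketch, not the authors' computation. It works in the transformed coordinates of Section 3, where (by assumption here) the first $J$ coordinates $w_1$ are $N(\eta_1, \sigma^2 I_J)$, the variance estimate satisfies $(T-K)\hat{\sigma}^2/\sigma^2 \sim \chi^2_{(T-K)}$ independently, and the estimator keeps $w_1$ when $u = w_1'w_1/(J\hat{\sigma}^2) \ge c$ and sets it to zero otherwise. All numerical values and function names are hypothetical.

```python
import math
import random


def f_ratio_cdf(c0, m, n):
    """Pr[chi2_m / chi2_n <= c0] from the series obtained by repeated
    integration by parts (w0 = c0 / (1 + c0))."""
    if c0 <= 0.0:
        return 0.0
    w0 = c0 / (1.0 + c0)
    a, b = m / 2.0, n / 2.0
    term = math.exp(math.lgamma(a + b) - math.lgamma(a + 1.0) - math.lgamma(b)
                    + a * math.log(w0) + b * math.log(1.0 - w0))
    total, j = term, 0
    while term > 1e-15 * total and j < 100000:
        term *= w0 * (a + b + j) / (a + j + 1.0)
        total += term
        j += 1
    return total


def h_lambda(lam, ell, c, J, n, terms=150):
    """h_lambda(ell) as a Poisson mixture of central F-ratio probabilities."""
    weight, total = math.exp(-lam), 0.0
    for k in range(terms):
        total += weight * f_ratio_cdf(c * J / n, J + ell + 2 * k, n)
        weight *= lam / (k + 1)
    return total


def simulated_mean(eta1, c, n, sigma, draws, seed):
    """Monte Carlo mean of the preliminary test estimate of eta_1."""
    rng = random.Random(seed)
    J = len(eta1)
    sums = [0.0] * J
    for _ in range(draws):
        w1 = [rng.gauss(e, sigma) for e in eta1]
        s2 = sigma ** 2 * rng.gammavariate(n / 2.0, 2.0) / n  # variance estimate
        u = sum(x * x for x in w1) / (J * s2)
        if u >= c:  # hypothesis rejected: keep the unrestricted values
            for i, x in enumerate(w1):
                sums[i] += x
    return [s / draws for s in sums]


eta1, sigma, n, c = [1.0, 0.5], 1.0, 20, 3.49  # hypothetical inputs
lam = sum(e * e for e in eta1) / (2.0 * sigma ** 2)
h2 = h_lambda(lam, 2, c, len(eta1), n)
sim = simulated_mean(eta1, c, n, sigma, draws=200000, seed=2024)
predicted = [(1.0 - h2) * e for e in eta1]
print(sim, predicted)
```

The simulated mean should be close to $(1 - h_\lambda(2))\eta_1$, i.e. the bias in these coordinates is $-h_\lambda(2)\eta_1$, up to Monte Carlo error.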

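The covariance matrix of $\hat{\hat{\beta}}$ is derived in Appendix D and is given in
(D.12) as

(6.5)  $E(\hat{\hat{\beta}}-E\hat{\hat{\beta}})(\hat{\hat{\beta}}-E\hat{\hat{\beta}})' = \sigma^2 S^{-1} - P^{-1}Q'\begin{bmatrix} h_\lambda(2)\sigma^2 I_J - [2h_\lambda(2)-h_\lambda(4)-h_\lambda^2(2)]\eta_1\eta_1' & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})',$

where $P$, $Q$, $h_\lambda(2)$, $h_\lambda(4)$ and $\eta_1$ were defined in Section 3.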
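
By using

$\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix} = QPS^{-1}R'(RS^{-1}R')^{-1}RS^{-1}P'Q'$

and

$\eta = QP(\beta - S^{-1}R'(RS^{-1}R')^{-1}r),$

so that

$P^{-1}Q'\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})' = S^{-1}R'(RS^{-1}R')^{-1}RS^{-1}$

and

$\begin{bmatrix} \eta_1\eta_1' & 0 \\ 0 & 0 \end{bmatrix} = QPS^{-1}R'(RS^{-1}R')^{-1}RS^{-1}P'Q'\,QP(\beta - S^{-1}R'(RS^{-1}R')^{-1}r)$
$\qquad\cdot\,(\beta' - r'(RS^{-1}R')^{-1}RS^{-1})P'Q'\,QPS^{-1}R'(RS^{-1}R')^{-1}RS^{-1}P'Q',$

(6.5) can be expressed as

(6.6)  $E(\hat{\hat{\beta}}-E\hat{\hat{\beta}})(\hat{\hat{\beta}}-E\hat{\hat{\beta}})' = \sigma^2 S^{-1} - \sigma^2 h_\lambda(2)S^{-1}R'(RS^{-1}R')^{-1}RS^{-1}$
$\qquad +\,[2h_\lambda(2)-h_\lambda(4)-h_\lambda^2(2)]\,S^{-1}R'(RS^{-1}R')^{-1}\delta\delta'(RS^{-1}R')^{-1}RS^{-1}.$

As a check, when $c \to \infty$ both $h_\lambda(2)$ and $h_\lambda(4)$ tend to one and (6.6) reduces to (6.2), while for $c = 0$ it reduces to $\sigma^2 S^{-1}$.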

Hence, the variance of $\hat{\hat{\beta}}$ depends on the variance of $b$, the error in the
restriction, $\delta$, and probabilities which are associated with the chance of
accepting  the  hypothesis  that  X  <  X     and  using  the  restriction  matrix,  R. 

7. The Risk for the Preliminary Test Estimator $X\hat{\hat{\beta}}$

In addition to the risk function for the preliminary test estimator $\hat{\hat{\beta}}$,
one might be interested in the quadratic loss for conditional mean forecasting
and thus the risk function for the estimator $X\hat{\hat{\beta}}$. For our case, this
implies a risk function

(7.1)  $E(X\hat{\hat{\beta}}-X\beta)'(X\hat{\hat{\beta}}-X\beta) = E(\hat{\hat{\beta}}-\beta)'X'X(\hat{\hat{\beta}}-\beta),$

which weights the elements in the quadratic form $E(\hat{\hat{\beta}}-\beta)'(\hat{\hat{\beta}}-\beta)$ with elements
from the cross products matrix, $X'X$. Wallace (1971) considered a risk function
of the form (7.1) for the restricted estimator $\hat{\beta}$ largely because of the
simplification it produced in the tables required for testing hypotheses about
the non-centrality parameter, $\lambda$. These same simplifications occur for the
preliminary test estimator.

The value of risk function (7.1) can be developed using the methodology
of Section 3 with only minor changes. Equation (3.7) from Section 3 becomes

(7.2)  $Q(P^{-1})'SP^{-1}Q' = A^* = \begin{bmatrix} A_1^* & A_3^* \\ A_3^{*\prime} & A_2^* \end{bmatrix} = I_K,$

since $(P^{-1})'SP^{-1} = I$ and $Q$ is an orthogonal transformation.

As  a  consequence  of  the  weighting  pattern  in  risk  function  (7.1),  the 
criterion  for  the  risk  function  for  b  to  exceed  that  of  §  is 

(7.3)  $\lambda \le \frac{1}{4}\sum_{i=1}^{J}\frac{d_i}{d_L} = \frac{J}{4},$

and  for  the  risk  function  of  §  to  exceed  that  of  b  is 

(7.4)  X  >  i-  Z   ^      "^ 


since $d_1,\dots,d_J$ are all ones. Thus, in this special case, which is in line
with the results of Section 3b, the value of $\lambda$ which is small enough
to ensure that the risk function of the preliminary test estimator is less than
that of the unrestricted estimator reduces to a single value, $J/4$. It should
be noted that if one assumes the orthonormality of regressors, this is
equivalent to taking (7.1) as the risk function. The extension of the orthonormal
regressor results to the general case is not a direct one, as the measure of
goodness is changed.
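
The comparison with the unrestricted estimator under risk function (7.1) can be explored numerically. The sketch below rests on an assumption that should be checked against Section 3: with all weights $d_i$ equal to one, taking the trace of the risk matrix (D.11) gives, in units of $\sigma^2$, the risk $K - Jh_\lambda(2) + 2\lambda[2h_\lambda(2) - h_\lambda(4)]$, against $K$ for the unrestricted estimator. The values of $J$, $K$, $T-K$ and $c$ are hypothetical, and the helper names are illustrative.

```python
import math


def f_ratio_cdf(c0, m, n):
    """Pr[chi2_m / chi2_n <= c0] via the series from repeated integration by parts."""
    if c0 <= 0.0:
        return 0.0
    w0 = c0 / (1.0 + c0)
    a, b = m / 2.0, n / 2.0
    term = math.exp(math.lgamma(a + b) - math.lgamma(a + 1.0) - math.lgamma(b)
                    + a * math.log(w0) + b * math.log(1.0 - w0))
    total, j = term, 0
    while term > 1e-15 * total and j < 100000:
        term *= w0 * (a + b + j) / (a + j + 1.0)
        total += term
        j += 1
    return total


def h_lambda(lam, ell, c, J, n, terms=150):
    """h_lambda(ell) = Pr[chi2_(lam, J+ell) / chi2_(n) <= cJ/n] (Poisson mixture)."""
    weight, total = math.exp(-lam), 0.0
    for k in range(terms):
        total += weight * f_ratio_cdf(c * J / n, J + ell + 2 * k, n)
        weight *= lam / (k + 1)
    return total


def pte_risk(lam, c, J, K, n):
    """Assumed risk of the preliminary test estimator under (7.1), in units of sigma^2."""
    h2 = h_lambda(lam, 2, c, J, n)
    h4 = h_lambda(lam, 4, c, J, n)
    return K - J * h2 + 2.0 * lam * (2.0 * h2 - h4)


J, K, n, c = 4, 6, 20, 2.0  # hypothetical design: n stands for T - K
for lam in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0):
    print(f"lambda={lam:5.2f}  PTE risk={pte_risk(lam, c, J, K, n):.4f}  OLS risk={K}")
```

Under this assumed expression the preliminary test risk stays below $K$ for every $\lambda \le J/4$, whatever the value of $c$, because $2h_\lambda(2) - h_\lambda(4) \le 2h_\lambda(2)$; it lies above $K$ once $\lambda \ge J/2$, because $h_\lambda(4) \le h_\lambda(2)$.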

8. Concluding Remarks

Using a quadratic risk function to measure estimator performance, a test
procedure proposed by Wallace, and the methods for evaluating the preliminary
test statistic developed by Stein and Sclove, we have investigated the
properties of the preliminary test estimator for the standard regression problem
in which both sample and exact linear prior information or hypotheses about
the parameters are utilized. In particular, a preliminary test estimator is
evaluated which permits the investigator to utilize both sample and exact,
though slightly incorrect, information to improve the estimates when judged
by a quadratic risk function criterion over certain regions of the linear
hypothesis space. The mean and variance of the preliminary test estimator
are specified, and the condition under which this estimator is better than the
conventional estimator, in a quadratic risk context, is derived in terms of
the non-centrality parameter, $\lambda$, of a non-central F distribution. It is
shown that the condition for the preliminary test estimator to be superior to the
conventional estimator is $\lambda \le \frac{1}{4}\sum_{i=1}^{J} d_i/d_L \le J/4$, which contrasts with the
result found by Wallace when comparing the risk functions of the restricted
and unrestricted estimators. When the risk functions for the orthonormal
regressors and conditional mean forecast cases are compared, the condition
for the preliminary test estimator to be superior is that $\lambda \le J/4$.

The choice of the level of the test, and hence the critical value,
conditions the relative gain or loss accruing to the preliminary test estimator
for various values of $\lambda$ and the $d_i$. At this stage, the choice of an optimal
critical point, $c$, that would satisfy some criterion or lead to an optimum
decision rule remains to be resolved. Inasmuch as the advantage of the
preliminary test estimator over the usual unrestricted estimator occurs when $\lambda$
is confined to an interval $\left(0, \frac{1}{4}\sum_{i=1}^{J} d_i/d_L\right)$, a Bayesian analysis in which a
prior distribution is placed on $\lambda$ seems to be one natural extension of this
work. As noted in the introduction, Sclove et al., in an unpublished paper,
have studied estimation preceded by testing for the orthonormal regressors
case and reach conclusions compatible with those we have derived for the
general model. In addition, it has been shown that for comparable values of $c$,
for $K$ or $J$ greater than 2, and for $0 < c < 2(K-2)/(T+K)$, the Stein-James
(1961) positive part estimator strictly dominates the preliminary test
estimator. These results should carry over to the general case, and what remains
to be done is to contrast the risk functions of the two estimators for
non-comparable critical values $c$.

Another line of inquiry would be to alter the exact constraint into a
stochastic constraint and, following the work of Theil and Goldberger (1961)
and Theil (1963), develop a preliminary test estimator for that model. A
third line of inquiry is to consider as a criterion a matrix of risk functions
rather than a risk function and thus extend the work of Toro-Vizcarrondo and
Wallace (1968). The authors are developing these lines of inquiry in other
papers at the present time.

Appendix A
Some  Theorems  and  Lemmas 


We now turn to Lemmas 1 and 2 to be used in Theorems 1 and 2.

Lemma 1. If the random variable $u$ is $N(\theta,1)$, then

$E[I_{(0,c)}(u^2)u^2] = \Pr(\chi^2_{(\theta^2/2,\,3)} \le c) + \theta^2\Pr(\chi^2_{(\theta^2/2,\,5)} \le c).$

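Proof: If $u$ is $N(\theta,1)$, then $u^2$ is distributed as non-central $\chi^2_{(\theta^2/2,\,1)}$. Thus,
$u^2$ is distributed as central $\chi^2_{(1+2H)}$, where $H$ is a random variable with a
Poisson distribution with parameter $\theta^2/2$. The expectation is

$E[I_{(0,c)}(u^2)u^2] = E_H\{E[I_{(0,c)}(t_h)t_h \mid H = h]\} = \sum_{h=0}^{\infty} E[I_{(0,c)}(t_h)t_h]\,\frac{e^{-\theta^2/2}(\theta^2/2)^h}{h!},$

where $t_h$ is distributed as $\chi^2_{(1+2h)}$. Since

$E[I_{(0,c)}(t_h)t_h] = \int_0^c \frac{t^{\frac{3+2h}{2}-1}e^{-t/2}}{2^{\frac{1+2h}{2}}\Gamma(\frac{1+2h}{2})}\,dt = (1+2h)\int_0^c \frac{t^{\frac{3+2h}{2}-1}e^{-t/2}}{2^{\frac{3+2h}{2}}\Gamma(\frac{3+2h}{2})}\,dt = (1+2h)\Pr(\chi^2_{(3+2h)} \le c),$

we have

$E[I_{(0,c)}(u^2)u^2] = e^{-\theta^2/2}\left[\sum_{h=0}^{\infty}\frac{(\theta^2/2)^h}{h!}\Pr(\chi^2_{(3+2h)} \le c) + 2\sum_{h=1}^{\infty}\frac{(\theta^2/2)^h}{(h-1)!}\Pr(\chi^2_{(3+2h)} \le c)\right]$

$\qquad = \Pr(\chi^2_{(\theta^2/2,\,3)} \le c) + \theta^2\Pr(\chi^2_{(\theta^2/2,\,5)} \le c)$

$\qquad = E[I_{(0,c)}(\chi^2_{(\theta^2/2,\,3)})] + \theta^2E[I_{(0,c)}(\chi^2_{(\theta^2/2,\,5)})],$

which was to be shown.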
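
Lemma 1, and Lemma 2 stated next, can be checked numerically. The following is a minimal sketch under stated assumptions: the left-hand sides are evaluated by Simpson integration of the normal density over $(-\sqrt{c}, \sqrt{c})$, and the non-central chi-square probabilities are evaluated as Poisson mixtures of central chi-square probabilities, with the paper's convention that the non-centrality parameter is $\theta^2/2$. The values of $\theta$ and $c$ and all function names are hypothetical.

```python
import math


def chi2_cdf(x, df):
    """Central chi-square CDF from the series for the lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    total, n = term, 0
    while term > 1e-17 * total and n < 100000:
        n += 1
        term *= z / (a + n)
        total += term
    return min(total, 1.0)


def ncx2_cdf(x, lam, df, terms=200):
    """Non-central chi-square CDF; lam is the paper's parameter (theta^2 / 2)."""
    weight, total = math.exp(-lam), 0.0
    for h in range(terms):
        total += weight * chi2_cdf(x, df + 2 * h)
        weight *= lam / (h + 1)
    return total


def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0


def normal_pdf(u, theta):
    return math.exp(-0.5 * (u - theta) ** 2) / math.sqrt(2.0 * math.pi)


theta, c = 1.3, 2.5  # hypothetical values
r, lam = math.sqrt(c), theta ** 2 / 2.0

lemma1_lhs = simpson(lambda u: u * u * normal_pdf(u, theta), -r, r)
lemma1_rhs = ncx2_cdf(c, lam, 3) + theta ** 2 * ncx2_cdf(c, lam, 5)
lemma2_lhs = simpson(lambda u: u * normal_pdf(u, theta), -r, r)
lemma2_rhs = theta * ncx2_cdf(c, lam, 3)
print(lemma1_lhs, lemma1_rhs)
print(lemma2_lhs, lemma2_rhs)
```

Each printed pair should agree to within the accuracy of the quadrature and of the truncated series.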

The  following  proof  is  a  variant  of  one  used  by  Stein  (1966)  to  obtain  a 
similar  result. 


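Lemma 2. If $u$ is $N(\theta,1)$, then

$E[I_{(0,c)}(u^2)u] = \theta E[I_{(0,c)}(\chi^2_{(\theta^2/2,\,3)})] = \theta\Pr(\chi^2_{(\theta^2/2,\,3)} \le c).$

Proof: If $u$ is $N(\theta,1)$, $u^2$ can be expressed as a central $\chi^2_{(1+2H)}$, where $H$ is
distributed as Poisson with parameter $\theta^2/2$. The expectation

$E[I_{(0,c)}(u^2)u] = \int_{-\infty}^{\infty} I_{(0,c)}(u^2)\,u\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(u-\theta)^2}\,du = e^{-\theta^2/2}\int_{-\infty}^{\infty} I_{(0,c)}(u^2)\,u\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{u^2}{2}+\theta u}\,du$

$\qquad = e^{-\theta^2/2}\,\frac{\partial}{\partial\theta}\left\{e^{\theta^2/2}E[I_{(0,c)}(u^2)]\right\} = e^{-\theta^2/2}\,\frac{\partial}{\partial\theta}\sum_{h=0}^{\infty}\frac{(\theta^2/2)^h}{h!}\,E[I_{(0,c)}(\chi^2_{(1+2h)})]$

$\qquad = \theta e^{-\theta^2/2}\sum_{h=1}^{\infty}\frac{(\theta^2/2)^{h-1}}{(h-1)!}\,E[I_{(0,c)}(\chi^2_{(3+2(h-1))})]$

$\qquad = \theta\Pr(\chi^2_{(\theta^2/2,\,3)} \le c).$

Q.E.D.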


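Using Lemma 2, we have:

Theorem 1: If the $(J \times 1)$ vector $u$ is distributed as $N(\theta,I)$, then

$E[I_{(0,c)}(u'u)u] = \theta E[I_{(0,c)}(\chi^2_{(\theta'\theta/2,\,J+2)})] = \theta\Pr(\chi^2_{(\theta'\theta/2,\,J+2)} \le c).$

Proof: Let $u = (u_1,\dots,u_J)'$ so $u'u = \sum_{j=1}^{J} u_j^2$. Conditioning on the $u_j$'s,

$E[I_{(0,c)}(u'u)u] = \{E[E[I_{(0,\,c-\sum_{j\ne 1}u_j^2)}(u_1^2)u_1 \mid u_j, j \ne 1]], \dots, E[E[I_{(0,\,c-\sum_{j\ne J}u_j^2)}(u_J^2)u_J \mid u_j, j \ne J]]\}',$

which by Lemma 2 gives

$E[I_{(0,c)}(u'u)u] = \{\theta_1E[I_{(0,\,c-\sum_{j=2}^{J}u_j^2)}(\chi^2_{(\theta_1^2/2,\,3)})], \dots, \theta_JE[I_{(0,\,c-\sum_{j=1}^{J-1}u_j^2)}(\chi^2_{(\theta_J^2/2,\,3)})]\}'$

$\qquad = \{\theta_1E[I_{(0,c)}(\chi^2_{(\theta_1^2/2,\,3)} + \sum_{j=2}^{J}\chi^2_{(\theta_j^2/2,\,1)})], \dots, \theta_JE[I_{(0,c)}(\chi^2_{(\theta_J^2/2,\,3)} + \sum_{j=1}^{J-1}\chi^2_{(\theta_j^2/2,\,1)})]\}'.$

Now, since the sum of independent variables with non-central chi square
distributions has a chi square distribution with a non-centrality parameter
which is the sum of the non-centrality parameters for the variables summed
and degrees of freedom equal to the sum of the degrees of freedom of the
individual variables,

$E[I_{(0,c)}(u'u)u] = \theta E[I_{(0,c)}(\chi^2_{(\theta'\theta/2,\,J+2)})] = \theta\Pr(\chi^2_{(\theta'\theta/2,\,J+2)} \le c).$

Q.E.D.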
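Using Lemma 1, we have:

Theorem 2: If $u$ is a $(J \times 1)$ vector distributed as $N(\theta,I)$ and $A$ is any
positive definite symmetric matrix, then

$E[I_{(0,c)}(u'u)u'Au] = E[I_{(0,c)}(\chi^2_{(\theta'\theta/2,\,J+2)})]\,\mathrm{tr}A + E[I_{(0,c)}(\chi^2_{(\theta'\theta/2,\,J+4)})]\,\theta'A\theta$

$\qquad = \Pr(\chi^2_{(\theta'\theta/2,\,J+2)} \le c)\,\mathrm{tr}A + \Pr(\chi^2_{(\theta'\theta/2,\,J+4)} \le c)\,\theta'A\theta.$

Proof: Let $P$ be an orthogonal matrix such that $PAP' = D = \mathrm{diag}(d_1,\dots,d_J)$, where
the $d_i > 0$ are the characteristic roots of $A$. Define the $(J \times 1)$ vector $w = Pu$.
So $w$ is distributed $N(P\theta,I)$. We then have

$E[I_{(0,c)}(u'u)u'Au] = E[I_{(0,c)}(w'w)w'Dw] = \sum_{i=1}^{J} d_iE[E[I_{(0,\,c-\sum_{j\ne i}w_j^2)}(w_i^2)w_i^2 \mid w_j, j \ne i]],$

which by Lemma 1 can be expressed as

$\sum_{i=1}^{J} d_i\{E[I_{(0,\,c-\sum_{j\ne i}w_j^2)}(\chi^2_{((p_i'\theta)^2/2,\,3)})] + (p_i'\theta)^2E[I_{(0,\,c-\sum_{j\ne i}w_j^2)}(\chi^2_{((p_i'\theta)^2/2,\,5)})]\},$

where $p_i'$ is the $i$th row of $P$. Therefore,

$E[I_{(0,c)}(u'u)u'Au] = \sum_{i=1}^{J} d_i\{E[I_{(0,c)}(\chi^2_{(\theta'\theta/2,\,J+2)})] + (p_i'\theta)^2E[I_{(0,c)}(\chi^2_{(\theta'\theta/2,\,J+4)})]\}$

$\qquad = \Pr(\chi^2_{(\theta'\theta/2,\,J+2)} \le c)\,\mathrm{tr}A + \Pr(\chi^2_{(\theta'\theta/2,\,J+4)} \le c)\,\theta'A\theta,$

since $\sum_i d_i = \mathrm{tr}A$ and $\sum_i d_i(p_i'\theta)^2 = \theta'P'DP\theta = \theta'A\theta$.

Q.E.D.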
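
Theorem 2 lends itself to a simulation check. The sketch below is illustrative only: it assumes a diagonal $A$ (a special case of a positive definite symmetric matrix), draws $u \sim N(\theta, I)$ with a fixed seed, and compares the sample average of $I_{(0,c)}(u'u)u'Au$ with the right-hand side, where the non-central chi-square probabilities are Poisson mixtures with the paper's parameter $\theta'\theta/2$. All inputs and names are hypothetical.

```python
import math
import random


def chi2_cdf(x, df):
    """Central chi-square CDF from the series for the lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    total, n = term, 0
    while term > 1e-17 * total and n < 100000:
        n += 1
        term *= z / (a + n)
        total += term
    return min(total, 1.0)


def ncx2_cdf(x, lam, df, terms=200):
    """Non-central chi-square CDF; lam is the paper's parameter (theta'theta / 2)."""
    weight, total = math.exp(-lam), 0.0
    for h in range(terms):
        total += weight * chi2_cdf(x, df + 2 * h)
        weight *= lam / (h + 1)
    return total


def monte_carlo_lhs(theta, diag_a, c, draws, seed):
    """Sample mean of I(u'u < c) * u'Au for u ~ N(theta, I) and diagonal A."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        u = [rng.gauss(t, 1.0) for t in theta]
        if sum(x * x for x in u) < c:
            total += sum(d * x * x for d, x in zip(diag_a, u))
    return total / draws


theta = [0.8, -0.5, 1.0]   # hypothetical mean vector, J = 3
diag_a = [1.0, 2.0, 3.0]   # hypothetical diagonal of A
c, J = 4.0, len(theta)
lam = sum(t * t for t in theta) / 2.0

rhs = (ncx2_cdf(c, lam, J + 2) * sum(diag_a)
       + ncx2_cdf(c, lam, J + 4) * sum(d * t * t for d, t in zip(diag_a, theta)))
lhs = monte_carlo_lhs(theta, diag_a, c, draws=200000, seed=12345)
print(lhs, rhs)
```

The two printed numbers should agree up to Monte Carlo error, which is of order $10^{-2}$ for this number of draws.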


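A theorem useful in evaluating the covariance matrix of $\hat{\hat{\beta}}$ is

Theorem 3: If the $(J \times 1)$ vector $u$ is distributed normally with mean vector $\theta$
and covariance matrix $I$ of order $J$, then

$E[I_{(0,c)}(u'u)uu'] = E[I_{(0,c)}(\chi^2_{(\lambda,\,J+2)})]\,I_J + E[I_{(0,c)}(\chi^2_{(\lambda,\,J+4)})]\,\theta\theta',$

where $\lambda = \theta'\theta/2$.

Proof: Let $u = (u_1,\dots,u_J)'$, and determine the diagonal and off-diagonal
elements of $E[I_{(0,c)}(u'u)uu']$. The diagonal elements are of the form

$E[I_{(0,c)}(\sum_j u_j^2)u_i^2] = E[E[I_{(0,c^*)}(u_i^2)u_i^2 \mid u_j, j \ne i]]$

(by Lemma 1 and letting $c^* = c - \sum_{j\ne i}u_j^2$)

$\qquad = E[I_{(0,c^*)}(\chi^2_{(\theta_i^2/2,\,3)})] + \theta_i^2E[I_{(0,c^*)}(\chi^2_{(\theta_i^2/2,\,5)})]$

$\qquad = E[I_{(0,c)}(\chi^2_{(\lambda,\,J+2)})] + \theta_i^2E[I_{(0,c)}(\chi^2_{(\lambda,\,J+4)})].$

The off-diagonal elements, for $i \ne j$, have the form

$E[I_{(0,c)}(u'u)u_iu_j] = E[u_jE[I_{(0,c^*)}(u_i^2)u_i \mid u_k, k \ne i]] = E[u_j\theta_iE[I_{(0,c^*)}(\chi^2_{(\theta_i^2/2,\,3)}) \mid u_k, k \ne i]]$

by Lemma 2 and where $c^* = c - \sum_{k\ne i}u_k^2$. Furthermore,

$\chi^2_{(\theta_i^2/2,\,3)} + \sum_{k\ne i,j}u_k^2$ is distributed as $\chi^2_{(\sum_{k\ne j}\theta_k^2/2,\;3+J-2)}.$

Now interchanging the roles of $u_j$ and $\chi^2_{(\sum_{k\ne j}\theta_k^2/2,\;J+1)}$, we have

$\theta_iE[E[I_{(0,\,c-\chi^2_{(\sum_{k\ne j}\theta_k^2/2,\;J+1)})}(u_j^2)u_j \mid \chi^2_{(\sum_{k\ne j}\theta_k^2/2,\;J+1)}]] = \theta_i\theta_jE[I_{(0,\,c-\chi^2_{(\sum_{k\ne j}\theta_k^2/2,\;J+1)})}(\chi^2_{(\theta_j^2/2,\,3)})]$

by Lemma 2. The unconditional expectations of the off-diagonal elements are

$\theta_i\theta_jE[I_{(0,c)}(\chi^2_{(\sum_{k\ne j}\theta_k^2/2,\;J+1)} + \chi^2_{(\theta_j^2/2,\,3)})] = \theta_i\theta_jE[I_{(0,c)}(\chi^2_{(\lambda,\,J+4)})],$

where $\lambda = \sum_{i=1}^{J}\theta_i^2/2$.

Combining the diagonal and off-diagonal components, the matrix may
be written as

$E[I_{(0,c)}(u'u)uu'] = E[I_{(0,c)}(\chi^2_{(\lambda,\,J+2)})]\,I_J + E[I_{(0,c)}(\chi^2_{(\lambda,\,J+4)})]\,\theta\theta'.$

Q.E.D.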

Appendix B. Properties of Functions of the
Non-Central F Distribution


In this Appendix, a theorem is developed which permits the evaluation of
regions where the non-centrality parameter, $\lambda$, of the non-central F
distribution is either small enough to ensure that the risk function of $\hat{\hat{\beta}}$ is less than
that of $b$ or large enough to ensure that the risk function of $\hat{\hat{\beta}}$ is larger
than that of $b$.

Theorem 1: Let

$h_\lambda(\ell) = \Pr\left[\frac{\chi^2_{(\lambda,\,J+\ell)}}{\chi^2_{(T-K)}} \le \frac{cJ}{T-K}\right]$

and

$w_0 = \frac{cJ}{T-K+cJ}.$

If $T-K \ge 2$, then $h_\lambda(4)/h_\lambda(2) \ge w_0$.


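Proof:

(B.1)  $h_\lambda(\ell) = \Pr\left[\frac{\chi^2_{(\lambda,\,J+\ell)}}{\chi^2_{(T-K)}} \le \frac{cJ}{T-K}\right] = \sum_{k=0}^{\infty}\frac{e^{-\lambda}\lambda^k}{k!}\Pr\left[\frac{\chi^2_{(J+\ell+2k)}}{\chi^2_{(T-K)}} \le \frac{cJ}{T-K}\right],$

where

(B.2)  $\Pr\left[\frac{\chi^2_{(J+\ell+2k)}}{\chi^2_{(T-K)}} \le \frac{cJ}{T-K}\right] = \int_0^{cJ/(T-K)} \frac{w^{\frac{J+\ell+2k}{2}-1}}{\beta\left(\frac{J+\ell+2k}{2},\frac{T-K}{2}\right)(1+w)^{\frac{J+\ell+2k+T-K}{2}}}\,dw$

(see Lindgren (1968), p. 380). Integrating the density by parts and defining $c_0 = cJ/(T-K)$ yields

$\frac{c_0^{\frac{J+\ell+2k}{2}}}{\frac{J+\ell+2k}{2}\,\beta\left(\frac{J+\ell+2k}{2},\frac{T-K}{2}\right)(1+c_0)^{\frac{J+\ell+2k+T-K}{2}}} + \int_0^{c_0} \frac{\left(\frac{J+\ell+2k+T-K}{2}\right)w^{\frac{J+\ell+2k}{2}}}{\frac{J+\ell+2k}{2}\,\beta\left(\frac{J+\ell+2k}{2},\frac{T-K}{2}\right)(1+w)^{\frac{J+\ell+2k+T-K}{2}+1}}\,dw,$

which can be written as

(B.3)  $\Pr\left[\frac{\chi^2_{(J+\ell+2k)}}{\chi^2_{(T-K)}} \le c_0\right] = t_{k,\ell} = t_{k+1,\ell} + \frac{c_0^{\frac{J+\ell+2k}{2}}}{\frac{J+\ell+2k}{2}\,\beta\left(\frac{J+\ell+2k}{2},\frac{T-K}{2}\right)(1+c_0)^{\frac{J+\ell+2k+T-K}{2}}}.$

Using (B.3) recursively,

$t_{k,\ell} = \sum_{j=k}^{\infty}\frac{c_0^{\frac{J+\ell+2j}{2}}}{\frac{J+\ell+2j}{2}\,\beta\left(\frac{J+\ell+2j}{2},\frac{T-K}{2}\right)(1+c_0)^{\frac{J+\ell+T-K+2j}{2}}},$

since $\lim_{k\to\infty} t_{k,\ell} = 0$. The previous definitions of $c_0$ and $w_0$ mean that
$c_0 = w_0/(1-w_0)$ and $1/(1+c_0) = 1-w_0$, and

(B.4)  $t_{k,\ell} = \sum_{j=k}^{\infty}\frac{\Gamma\left(\frac{J+\ell+T-K+2j}{2}\right)}{\Gamma\left(\frac{J+\ell+2j}{2}+1\right)\Gamma\left(\frac{T-K}{2}\right)}\,w_0^{\frac{J+\ell+2j}{2}}(1-w_0)^{\frac{T-K}{2}}.$

It follows from using (B.4) in (B.1) that

$h_\lambda(\ell) = e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}\sum_{j=k}^{\infty}\frac{\Gamma\left(\frac{J+\ell+T-K+2j}{2}\right)}{\Gamma\left(\frac{J+\ell+2j}{2}+1\right)\Gamma\left(\frac{T-K}{2}\right)}\,w_0^{\frac{J+\ell+2j}{2}}(1-w_0)^{\frac{T-K}{2}}.$

Hence, writing $a_j$ for the term with index $j$ in the inner sum for $\ell = 2$,

$\frac{h_\lambda(4)}{h_\lambda(2)} = \frac{\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}\sum_{j=k}^{\infty} a_j\,w_0\,\frac{J+2+T-K+2j}{J+2j+4}}{\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}\sum_{j=k}^{\infty} a_j} \ge w_0,$

since $\frac{J+2+T-K+2j}{J+2j+4} \ge 1$ whenever $T-K \ge 2$, which proves the theorem. Furthermore,

$\frac{J+2+T-K+2j}{J+2+2j+2} = \frac{(J+2+T-K)+2j}{(J+4)+2j} \le \frac{J+2+T-K}{J+4}$

for $j = 0,1,2,\dots$; hence,

$\frac{h_\lambda(4)}{h_\lambda(2)} \le w_0\left(\frac{J+2+T-K}{J+4}\right) = w_0\left(1 + \frac{T-K-2}{J+4}\right),$ when $T-K \ge 2$.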
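
The two bounds on $h_\lambda(4)/h_\lambda(2)$ can be verified numerically. The sketch below is an illustration under stated assumptions, not the authors' code: the central F-ratio probability is evaluated from the series obtained by repeated integration by parts, with $w_0 = c_0/(1+c_0)$, and $h_\lambda(\ell)$ is formed as a Poisson mixture as in (B.1). The values of $J$, $T-K$ and $c$ are hypothetical.

```python
import math


def f_ratio_cdf(c0, m, n):
    """Pr[chi2_m / chi2_n <= c0] as the series
    sum_j Gamma((m+n)/2 + j) / (Gamma(m/2 + j + 1) Gamma(n/2)) w0^(m/2+j) (1-w0)^(n/2)."""
    if c0 <= 0.0:
        return 0.0
    w0 = c0 / (1.0 + c0)
    a, b = m / 2.0, n / 2.0
    term = math.exp(math.lgamma(a + b) - math.lgamma(a + 1.0) - math.lgamma(b)
                    + a * math.log(w0) + b * math.log(1.0 - w0))
    total, j = term, 0
    while term > 1e-15 * total and j < 100000:
        term *= w0 * (a + b + j) / (a + j + 1.0)  # ratio of consecutive terms
        total += term
        j += 1
    return total


def h_lambda(lam, ell, c, J, n, terms=150):
    """h_lambda(ell) = Pr[chi2_(lam, J+ell) / chi2_(n) <= cJ/n], n = T - K."""
    weight, total = math.exp(-lam), 0.0
    for k in range(terms):
        total += weight * f_ratio_cdf(c * J / n, J + ell + 2 * k, n)
        weight *= lam / (k + 1)
    return total


J, n, c = 4, 20, 2.0                  # hypothetical values, n stands for T - K
w0 = c * J / (n + c * J)
upper = w0 * (J + 2 + n) / (J + 4)
ratios = []
for lam in (0.0, 0.5, 2.0, 10.0):
    ratio = h_lambda(lam, 4, c, J, n) / h_lambda(lam, 2, c, J, n)
    ratios.append(ratio)
    print(f"lambda={lam:5.1f}  w0={w0:.4f}  ratio={ratio:.4f}  upper={upper:.4f}")
```

For every $\lambda$ tried, the ratio should fall between $w_0$ and $w_0(J+2+T-K)/(J+4)$. As a separate sanity check, for $J = 2$ and $\lambda = 0$ the series reproduces the central F probability $1 - (1 + 2c/n)^{-n/2}$.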

Appendix C. The Risk Function for $\hat{\beta}$

The evaluation of $E(\hat{\beta}-\beta)'(\hat{\beta}-\beta)$ begins with text equation (2.7d),

(C.0)  $E(\hat{\beta}-\beta)'(\hat{\beta}-\beta) = \mathrm{tr}\,\mathrm{cov}(\hat{\beta}) + \mathrm{tr}\,(\mathrm{bias}\,\hat{\beta})(\mathrm{bias}\,\hat{\beta})',$

and the covariance and bias terms are transformed using the transformations
described in Section 3a. By (2.7b), the covariance term can be expressed as


(C.l) 


a^[trA  -  trQ(P""^)'Q' 


0     0, 


QCP'b'Q'], 


=     a  [trA  -  tr 


0     Oj 


A] 


Equation   (C.l)   can  be  expressed  as 


(C.2)     =  a^trA- 


2^""^  (2) 


,(2) 


where  d.   are  the  characteristic  roots  of  A_  as  defined  subsequent  to  defi- 
nition (3.7). 

The  bias  term  in  (CO)  i? 

(C.5)     tr  S"'R' (RS"-^R')~-^o6'(RS"'^R')''^RS~-^ 

=  tr[QPS'-^R'(RS'-^R')"^(RP'-^Q'QPe-r)]'Q(P"h'P"'^Q' 


•[QPS"'^R'  (RS"-^R')""^(Rp'-^Q'QPB-r)] 


0  0 


QPa-QPS~'^R' (RS"-^R')  ^r]'A[ 


I  0 
0  0 


QPS-QPS"-^R'  (RS''^R')"'^r] 


We note that

(C.4)  $QPS^{-1}R'(RS^{-1}R')^{-1} = \begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}QPS^{-1}R'(RS^{-1}R')^{-1}.$


Substituting  CC.4)  into  (C.S),  we  have 


CC.5) 


tr  S"-^R'CRS"-^R')"-^<S6'CRS"-^R')~'^RS"^ 


[QPC8-S'-^R'(RS"-^R')"-^r)] 


0     0 

A3  A2. 

.0     0. 

•[QPC6-S"'^R'(RS"''-R')'-^r:)] 


J 

Z 
i=l 


=  _Edf^(Cp^ 


where  E,^   and  d.  ^   are  defined  in  Sections  3a  and  3b. 
Cons^uently,  using  (0.2)  and  (C.5), 


K-J        J 
CC.6)     E(g-6)  •(§-$)  =  a^  E  dp^  -.  Zdp^(£*)^ 
■  '   '  ■       j=l  J     i=l  ^    ^ 

Appendix D. Bias and Covariance for $\hat{\hat{\beta}}$

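In this Appendix, the bias and covariance matrix for $\hat{\hat{\beta}}$ are determined.
Returning to the definition of $\hat{\hat{\beta}}$ and its derived value in (2.11), and the
transformations of Section 3,

(D.1)  $\hat{\hat{\beta}} = b - I_{(0,c)}(u)\,P^{-1}Q'\begin{bmatrix} w_1 \\ 0 \end{bmatrix}.$

Therefore, the expected value of $\hat{\hat{\beta}}$ is

(D.2)  $E\hat{\hat{\beta}} = \beta - \sigma P^{-1}Q'\begin{bmatrix} E\left[I_{(0,c^*)}\left(\frac{w_1'w_1}{\sigma^2}\right)\frac{w_1}{\sigma}\right] \\ 0 \end{bmatrix},$

since $c \ge u = \frac{w_1'w_1}{J\hat{\sigma}^2}$ and $\frac{w_1'w_1}{\sigma^2} \le \frac{cJ\hat{\sigma}^2}{\sigma^2} = c^*$, as shown in Section 3b.
Using Theorem 1 to evaluate the expectation in (D.2),

(D.3)  $E\hat{\hat{\beta}} = \beta - P^{-1}Q'\begin{bmatrix} \eta_1 \\ 0 \end{bmatrix}\Pr\left[\frac{\chi^2_{(\lambda,\,J+2)}}{\chi^2_{(T-K)}} \le \frac{cJ}{T-K}\right].$

Noticing that

$P^{-1}Q'\begin{bmatrix} \eta_1 \\ 0 \end{bmatrix} = P^{-1}Q'\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}[QP\beta - QPS^{-1}R'(RS^{-1}R')^{-1}(R\beta-\delta)],$

that $\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}QP\beta = QPS^{-1}R'(RS^{-1}R')^{-1}R\beta$, as well as

$\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}QPS^{-1}R'(RS^{-1}R')^{-1}\delta = QPS^{-1}R'(RS^{-1}R')^{-1}\delta,$

we obtain

(D.4)  $E\hat{\hat{\beta}} = \beta - S^{-1}R'(RS^{-1}R')^{-1}\delta\,\Pr\left[\frac{\chi^2_{(\lambda,\,J+2)}}{\chi^2_{(T-K)}} \le \frac{cJ}{T-K}\right],$

and the amount of bias depends on $\delta$ and the level of the test, $\alpha$, through $c$.
A small $\alpha$ implies a larger $c$ and hence increases the probability that a given
$\delta$ will be included in $E\hat{\hat{\beta}}$.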

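Turning now to the covariance matrix of $\hat{\hat{\beta}}$, it can be written as

(D.5)  $\mathrm{Var}(\hat{\hat{\beta}}) = E(\hat{\hat{\beta}}-\beta)(\hat{\hat{\beta}}-\beta)' - E(\hat{\hat{\beta}}-\beta)E(\hat{\hat{\beta}}-\beta)'$

$\qquad = \mathrm{MSE}\,\hat{\hat{\beta}} - \left(\Pr\left[\frac{\chi^2_{(\lambda,\,J+2)}}{\chi^2_{(T-K)}} \le \frac{cJ}{T-K}\right]\right)^2[S^{-1}R'(RS^{-1}R')^{-1}\delta][\delta'(RS^{-1}R')^{-1}RS^{-1}],$

but

$QPS^{-1}R'(RS^{-1}R')^{-1}\delta = \begin{bmatrix} \eta_1 \\ 0 \end{bmatrix},$

so

(D.6)  $\mathrm{Var}\,\hat{\hat{\beta}} = \mathrm{MSE}\,\hat{\hat{\beta}} - \left(\Pr\left[\frac{\chi^2_{(\lambda,\,J+2)}}{\chi^2_{(T-K)}} \le \frac{cJ}{T-K}\right]\right)^2 P^{-1}Q'\begin{bmatrix} \eta_1\eta_1' & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})'.$

From the definition of $w = QP[b - S^{-1}R'(RS^{-1}R')^{-1}r]$, given after (3.4),
the MSE of $\hat{\hat{\beta}}$ is

(D.7)  $E(\hat{\hat{\beta}}-\beta)(\hat{\hat{\beta}}-\beta)' = E(b-\beta)(b-\beta)' - E[(b-\beta)I_{(0,c)}(u)w'Q(P^{-1})'R'(RS^{-1}R')^{-1}RS^{-1}]$

$\qquad - E[I_{(0,c)}(u)(S^{-1}R'(RS^{-1}R')^{-1}RP^{-1}Q'w)(b-\beta)']$

$\qquad + S^{-1}R'(RS^{-1}R')^{-1}RP^{-1}Q'\,E[I_{(0,c)}(u)ww']\,Q(P^{-1})'R'(RS^{-1}R')^{-1}RS^{-1}$

(D.7')  $= \sigma^2S^{-1} - P^{-1}Q'\,E[I_{(0,c)}(u)(w-\eta)w']\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})' - P^{-1}Q'\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}E[I_{(0,c)}(u)w(w-\eta)']\,Q(P^{-1})'$

$\qquad + P^{-1}Q'\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}E[I_{(0,c)}(u)ww']\begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})',$

since

$Q(P^{-1})'R'(RS^{-1}R')^{-1}RP^{-1}Q' = \begin{bmatrix} I_J & 0 \\ 0 & 0 \end{bmatrix}$

and $S^{-1} = P^{-1}(P^{-1})' = P^{-1}Q'Q(P^{-1})'$.

Making use of the results of Section 3b, $u = \frac{w_1'w_1}{J\hat{\sigma}^2} \le c$, so $\frac{w_1'w_1}{\sigma^2} \le \frac{cJ\hat{\sigma}^2}{\sigma^2} = c^*$.
Given $\hat{\sigma}^2$, and since $w_2$ is independent of $w_1$ and $\hat{\sigma}^2$, we have for the second term of (D.7')

(D.8)  $P^{-1}Q'\begin{bmatrix} E\left[I_{(0,c^*)}\left(\frac{w_1'w_1}{\sigma^2}\right)(w_1-\eta_1)w_1'\right] & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})'.$

A similar argument holds for the third and fourth terms of (D.7'), so

(D.9)  $E(\hat{\hat{\beta}}-\beta)(\hat{\hat{\beta}}-\beta)' = \sigma^2S^{-1} - P^{-1}Q'\begin{bmatrix} E\left[I_{(0,c^*)}\left(\frac{w_1'w_1}{\sigma^2}\right)(w_1-\eta_1)w_1'\right] & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})'$

$\qquad - P^{-1}Q'\begin{bmatrix} E\left[I_{(0,c^*)}\left(\frac{w_1'w_1}{\sigma^2}\right)w_1(w_1-\eta_1)'\right] & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})' + P^{-1}Q'\begin{bmatrix} E\left[I_{(0,c^*)}\left(\frac{w_1'w_1}{\sigma^2}\right)w_1w_1'\right] & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})'.$

Using Theorems 1 and 3 of Appendix A, the risk matrix of $\hat{\hat{\beta}}$, given $\hat{\sigma}^2$, is

(D.10)  $E(\hat{\hat{\beta}}-\beta)(\hat{\hat{\beta}}-\beta)' = \sigma^2S^{-1} - P^{-1}Q'\begin{bmatrix} \sigma^2I_J\Pr(\chi^2_{(\lambda,J+2)} \le c^*) + \eta_1\eta_1'\Pr(\chi^2_{(\lambda,J+4)} \le c^*) - 2\eta_1\eta_1'\Pr(\chi^2_{(\lambda,J+2)} \le c^*) & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})',$

or, after taking the expectation over $\hat{\sigma}^2$,

(D.11)  $E(\hat{\hat{\beta}}-\beta)(\hat{\hat{\beta}}-\beta)' = \sigma^2S^{-1} - P^{-1}Q'\begin{bmatrix} \sigma^2I_J\,h(2) + \eta_1\eta_1'\,h(4) - 2\eta_1\eta_1'\,h(2) & 0 \\ 0 & 0 \end{bmatrix}Q(P^{-1})',$

where $h(\ell) = E\left[I_{(0,\,cJ/(T-K))}\left(\frac{\chi^2_{(\lambda,\,J+\ell)}}{\chi^2_{(T-K)}}\right)\right] = \Pr\left[\frac{\chi^2_{(\lambda,\,J+\ell)}}{\chi^2_{(T-K)}} \le \frac{cJ}{T-K}\right]$.

Combining equations (D.6) and (D.11), the variance of $\hat{\hat{\beta}}$ is

(D.12)  $E(\hat{\hat{\beta}}-E\hat{\hat{\beta}})(\hat{\hat{\beta}}-E\hat{\hat{\beta}})' = \sigma^2S^{-1} - P^{-1}Q'\begin{bmatrix} h(2)\sigma^2I_J + h(4)\eta_1\eta_1' - 2h(2)\eta_1\eta_1' & 0 \\ 0 & 0_{K-J} \end{bmatrix}Q(P^{-1})'$

$\qquad - P^{-1}Q'\begin{bmatrix} h^2(2)\eta_1\eta_1' & 0 \\ 0 & 0_{K-J} \end{bmatrix}Q(P^{-1})',$

where $\eta_1\eta_1'$ and $I_J$ are $(J \times J)$ matrices, $0_{K-J}$ is a $(K-J) \times (K-J)$ null matrix
and the remaining null matrices are of the proper order for conformability
in multiplication.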




References 


Ashar, V.G. (1970): "On the Use of Preliminary Tests in Regression,"
unpublished thesis, North Carolina State University, Raleigh.

Bancroft, T.A. (1944): "On Biases in Estimation Due to the Use of Preliminary
Tests of Significance," Annals of Mathematical Statistics, Vol. 15, pp.
190-204.

Bancroft, T.A. (1964): "Analysis and Inference for Incompletely Specified
Models Involving the Use of Preliminary Test(s) of Significance,"
Biometrics, Vol. 20, pp. 427-442.

Chipman, J.S. and M.M. Rao (1964): "The Treatment of Linear Restrictions in
Regression Analysis," Econometrica, Vol. 32, pp. 198-209.

Cohen, A. (1965): "Estimates of the Linear Combination of Parameters in the
Mean Vector of a Multivariate Distribution," Annals of Mathematical
Statistics, Vol. 36, pp. 78-87.

Huntsberger, D.V. (1955): "A Generalization of a Preliminary Testing
Procedure for Pooling Data," Annals of Mathematical Statistics, Vol. 26, pp.
734-743.

James, W. and C. Stein (1961): "Estimation with Quadratic Loss," Proceedings
of the Fourth Berkeley Symposium on Mathematical Statistics and Probability,
University of California Press, Berkeley, pp. 361-379.

Kitagawa, T. (1963): "Estimation After Preliminary Tests of Significance,"
University of California Publications in Statistics, Vol. 3, pp. 147-186.

Larson, H.J. and T.A. Bancroft (1963a): "Sequential Model Building for
Prediction in Regression Analysis," Annals of Mathematical Statistics, Vol.
34, pp. 462-479.



Larson, H.J. and T.A. Bancroft (1963b): "Biases in Prediction by Regression
for Certain Incompletely Specified Models," Biometrika, Vol. 50, pp.
391-402.

Lindgren, B.W. (1968): Statistical Theory, The MacMillan Co., London, p. 380.

Mosteller, F. (1948): "On Pooling Data," Journal of the American Statistical
Association, Vol. 43, pp. 231-242.

Rao, C. (1945): "Markoff's Theorem with Linear Restrictions on Parameters,"
Sankhya, Vol. 7, pp. 16-19.

Sacks, J. (1963): "Generalized Bayes Solution in Estimation Problems," Annals
of Mathematical Statistics, Vol. 34, pp. 751-768.

Sawa, T. and T. Hiromatsu (1971): "Minimax Regret Significance Points for a
Preliminary Test in Regression Analysis," Technical Report 39, Institute
for Mathematical Studies in the Social Sciences, Stanford University.

Sclove, S.L., C. Morris and R. Radhakrishnan (1970): "Non Optimality of
Preliminary-Test Estimators for the Multinormal Mean," to appear in Annals
of Mathematical Statistics.

Stein, C. (1966): "An Approach to the Recovery of Interblock Information in
Balanced Incomplete Block Designs," Research Papers in Statistics:
Festschrift for J. Neyman, F.N. David (ed.), John Wiley and Sons, Inc.,
New York, pp. 351-366.

Strawderman, W.E. and A. Cohen (1971): "Admissibility of Estimators of the
Mean Vector of a Multivariate Normal Distribution with Quadratic Loss,"
Annals of Mathematical Statistics, Vol. 42, pp. 270-296.

Theil, H. (1963): "On the Use of Incomplete Prior Information in Regression
Analysis," Journal of the American Statistical Association, Vol. 58,
pp. 401-414.



Theil, H. and A.S. Goldberger (1961): "On Pure and Mixed Regression
Estimation in Economics," International Economic Review, Vol. 2, pp. 65-78.

Toro-Vizcarrondo, C. and T.D. Wallace (1968): "A Test of the Mean Square
Error Criterion for Restrictions in Linear Regression," Journal of the
American Statistical Association, Vol. 63, pp. 558-572.

Wallace, T.D. (1971): "Weaker Criteria and Tests for Linear Restrictions in
Regression," to appear in Econometrica.




